Chapter 2

Figure 2.1

Shaded relief map of the area around PFO, CA. The San Jacinto and San
Andreas faults are shown to the southwest and northeast of PFO,
respectively. Some smaller cross-faults near PFO, referred to in the text, are
included (from Rogers, 1965). The circle about PFO is of 10 km radius. The
sea level contour is indicated by a dashed line.


Figure 2.2

Station-centered plot of 117 events at PFO by ray parameter and
backazimuth. Circles, from the outermost inward, are at 0.08, 0.06, and 0.04
sec/km.  Nearby  events  have  been  grouped  according  to  ray  parameter  and 
backazimuth  (the  events  in  each  group  are  encircled).  The  numbers 
correspond  to  the  receiver  functions  in  figure  8.  Isolated  events  are  used  in 
the  individual  analyses  discussed  in  the  text. 


Figure 2.3

Receiver  functions  calculated  for  events  binned  by  ray  parameter  only.  The 
average  distance  in  degrees  for  each  set  of  events  is  to  the  left  of  each  trace. 


Figure  2.4 


Incremental  development  of  a  simple  velocity  model.  On  the  right  side,  the 
receiver  function  from  the  steepest  incidence  angle  group  of  figure  3  is 
plotted  as  a  solid  line  and  the  synthetic  receiver  function  corresponding  to 
the  velocity  model  shown  to  the  left  is  plotted  as  a  dashed  line.  Phases 
indicated  in  the  synthetics  are  due  to  the  newest  feature  of  each  successive 
model.
13


Figure 2.5

Two velocity models that fit the data equally well based on forward
modeling.

Figure 2.6

Synthetic  receiver  functions  for  the  simple  (dotted)  and  modified  (dashed) 
models  compared  to  the  data  (solid).  Comparisons  are  made  for  the  steepest 
(bottom)  and  second  steepest  incidence  (top)  group  receiver  functions  of 
figure 3.


Figure  2.7  .  18 

Ray  paths  of  converted  and  reverberated  phases  for  horizontal  and  dipping 
interfaces.  Note  the  difference  in  horizontal  distance  from  the  station 
sampled  by  similar  phases  updip  relative  to  downdip. 

Figure  2.8  .  19 


Receiver  functions,  low-pass  filtered  below  1  hertz,  for  the  event  groups 
pictured  in  figure  2,  with  group  numbers  of  figure  2  above  each  trace. 
Average  backazimuth  and  distance  for  each  group  is  to  the  left  of  the  traces. 

Note  the  variability  of  the  Moho  Ps-P  time  at  about  4  seconds  (downward 
arrow).  Upward  arrow  indicates  negative  peak  after  Moho  Ps. 

Figure  2.9  .  20 

Moho  Ps-P  times  for  individual  events,  plotted  on  the  same  station  centered 
plot  as  in  figure  2.  The  distinctive  pattern  of  longer  times  to  the  northwest 
and  shorter  times  to  the  southeast  in  the  data  (top  plot)  is  well  matched  by 
synthetics  calculated  for  a  32  km  deep  6.3  km/sec  layer  over  a  half  space 
dipping  20  degrees  to  the  northwest. 

Figure  2.10  .  21 

Contour plot of the demeaned one norm of observed-minus-predicted Moho
Ps-P times in seconds, for models of a dipping planar layer over a half space
ranging over all strikes and dips up to 30° in 5° increments. The minimum is
at 15 to 20 degrees northwest dip.
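
The misfit measure named in this caption can be illustrated with a minimal sketch. It assumes that "demeaned one norm" means the sum of absolute residuals after removing the mean residual; whether the original analysis summed or averaged is not stated, and the function name and sample times below are hypothetical.

```python
def demeaned_one_norm(observed, predicted):
    """Sum of absolute residuals after removing their mean (assumed definition)."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_residual = sum(residuals) / len(residuals)
    return sum(abs(r - mean_residual) for r in residuals)

# Hypothetical Moho Ps-P times in seconds for three events.
observed_times = [4.0, 4.2, 3.9]
constant_offset_model = [3.5, 3.7, 3.4]   # differs from the data by a constant
poorly_fitting_model = [3.5, 4.4, 3.0]    # differs by an event-dependent amount

misfit_offset = demeaned_one_norm(observed_times, constant_offset_model)
misfit_poor = demeaned_one_norm(observed_times, poorly_fitting_model)
```

Because the mean is removed, a model that differs from the data only by a constant time shift scores essentially zero, so the measure responds to the azimuthal pattern of the times rather than to their absolute level.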

Figure 2.11 . 22

Demeaned  Moho  Ps-P  time  residuals  for  a  6.3  km/sec  32  km  deep  layer  over 
a  half  space  dipping  20  degrees  northwest  (left).  Points  are  plotted  at  the 
horizontal  distance  and  backazimuth  from  the  station  that  the  Ps  converted 
phase  would  be  generated  at  the  Moho  and  are  scaled  by  the  size  of  the 
residual.  Note  that  for  this  model,  all  rays  cross  the  Moho  directly  beneath  or 
to the southeast of the station. The dashed line running 12° south of east
indicates the position of one of the cross sections of figure 17. Demeaned
Moho Ps-P time residuals relative to a 6.3 km/sec 30 km deep horizontal
layer over a half space (right). Note not only the greater consistency in
residual times for the dipping model, but also the different areas of the Moho
sampled in each model.

Figure  2.12  .  25 

Moho  Ps/P  amplitude  ratios  for  receiver  functions  of  groups  of  figure  2,  on 
the  same  station  centered  plot.  Azimuthal  variation  observed  in  the  data 
(left)  is  roughly  matched  by  synthetics  for  the  same  6.3  km/sec  32  km  deep 
layer  over  a  half  space  dipping  20  degrees  northwest.  Absolute  values  of  the 
amplitudes  are  not  important  (see  the  discussion  regarding  uncertainty). 

Only  relative  amplitudes  at  different  azimuths  are  used  to  infer  dip  direction. 

Plus symbols indicate reversed polarity of the Ps arrival.

Figure 2.13 . 27

Normalized P waves (vertical recordings) of impulsive events from group 5
of figure 2 (top five traces), aligned by arrival time, and their stack (bottom
trace).

Figure 2.14

Stacks for the 4 groups containing impulsive events. PpPmp-P times are
listed above the presumed PpPmp arrival of each trace. The position of each
group is indicated, with the circle scaled by the residual relative to the
flat-lying 6.3 km/sec 30 km deep layer over a half space of figure 11 (so as to
account for the effect of variations in ray parameter between groups).


Figure 2.15

Tangential component of mislocation vectors for initial P waves at PFO
(top) and for a layer over a half space dipping 20 degrees towards S40W
(bottom). Circles are at 10°, 25°, and 40° incidence. Predicted positions of
the events are plotted as disks. Arrows point to the measured positions of the
events. Eighteen of twenty-one mislocation vectors in the northwest
quadrant point clockwise, with a mean of nearly 5 degrees. The
southeast quadrant has a mean tangential mislocation of 2.5 degrees
counterclockwise, with sixteen of twenty-one events consistent in direction,
while the southwest quadrant mislocation vectors are much less consistent
and have a mean of 1.5 degrees clockwise.


Figure 2.16

Radial (solid line, upper trace) and tangential (solid line, second trace)
receiver functions of group 5 (from figure 2), and their point-by-point
product, indicating their coherence (solid line, lower trace). Synthetic
receiver functions and their coherence (dashed lines) are shown below the
data traces, for the refined model of figure 5 in which the shallowest
discontinuity dips 15° southward. Vertical lines through the traces delineate
the windows for which particle motion is plotted below (the upper row is
for the data and the lower row is for the synthetics).


Figure 2.17

North-south (upper left) and nearly east-west (upper right) topographic
cross-sections  through  PFO.  The  position  of  the  second  cross-section  is 
indicated  by  a  dashed  line  through  the  center  of  the  circle  in  the  left  side  of 
figure  11.  PFO's  position  is  indicated  by  the  triangle.  The  lower  plots 
indicate  Moho  depth  for  each  cross-section  inferred  by  assuming  Airy 
isostasy  with  the  depth  of  compensation  at  the  Moho.  The  short  horizontal 
bars  are  estimates  of  Moho  depth  based  on  receiver  function 




each  signal  to  noise  level  (the  maximum  signal  peak  to  rms  pre-event  noise 
levels,  S/N,  are  listed  to  the  left  of  each  set  of  traces).  The  receiver  functions 
for 25 such synthetic seismograms (with different source functions and noise
windows,  but  the  same  S/N  levels),  are  shown  on  the  right,  with  the  mean 
and  2  standard  deviations  uncertainty  listed  for  the  P  and  Ps  amplitudes  and 
the  Ps-P  timing  error  of  each  set. 


Figure 3.7

Comparison  of  a  stack  of  25,  S/N=10,  receiver  functions  (bottom  right)  with 
a  synthetic  receiver  function  for  the  same  layer  over  a  half  space  model  after 
convolution  with  the  averaging  function  (averaging  function  is  upper  left 
and  convolution  with  synthetic  is  upper  right).  The  sidelobes  of  the 
averaging  function  are  very  much  smaller  than  those  of  the  receiver  function. 
A  more  accurate  estimate  of  the  effects  of  deconvolution  on  the  receiver 
function is shown on the lower left (discussed in text).

Figure 3.8

Eigenvectors of the A^T A matrix (where A is the convolution matrix of the
vertical seismograms) for the receiver function shown in figure 1. The
smallest eigenvalues are associated with high frequency oscillations.

Figure 3.9

a) Velocity model used for OBN data. b) Fit of the synthetic from the model
in (a) (dashed line) to the OBN data (solid line).

Figure 3.10

VSS for data recorded at OBN. Upper left, view in 3-space. Upper right,
view in Vp-depth space. Lower left, view in Vs-depth space. Lower right,
view in Vp-Vs space.

Figure 3.11

a) The simple velocity model adopted for demonstration of the effects of
moveout with ray parameter (solid line) and ±10% perturbations to that
model (dashed lines). b) Receiver functions calculated for the models of
figure 11a, for a ray parameter of 0.04. c) The same for a ray parameter of
0.06, and d) the same for a ray parameter of 0.08.

Figure 3.12

a) The same velocity model of figure 11a (solid line), and a velocity model
with 2% lower S-wave velocity than the dashed line model of figure 11a. b)
Synthetic receiver functions for the velocity models of (a), showing no
difference in arrival times.

Figure 3.13

Receiver  functions  for  the  same  layer  over  a  half  space  model  used  for  all 
synthetics  in  this  paper.  The  solid  line  shows  the  ideal  receiver  function.  The 
long and short dashed lines respectively show receiver functions for which
the ray arrives with plus and minus 3 degrees difference in incidence angle
from  that  expected.  The  P  and  Ps  amplitudes  for  each  are  shown. 


Chapter  4 

Figure  4.1  .  84 

Peak  log(Lg/Pg)  amplitude  ratios  for  a  subset  of  SCSN  data,  corrected  to  the 
WWSSN  response.  Earthquake  records  are  plotted  as  crosses  and  explosion 
records  are  plotted  as  circles.  The  discrimination  line  separating  earthquakes 
and explosions matches that found by Taylor et al. (1986) for the western
U.S. We will address whether the scatter and misidentification (i.e. symbols
on the “wrong side” of the line) are due to identifiable propagation effects.


Figure  4.2  .  86 

Map  of  southern  California  seismic  network  stations  (triangles)  with 
regional  earthquakes  (asterisks)  and  explosions  (circles)  that  are  discussed  in 
the  chapter. 

Figure  4.3  .  87 


Results  of  log(Lg/Pg)  discrimination  using  southern  California  network 
recordings  for  4  nuclear  explosions.  Crosses  indicate  identification  of  the 
source  as  an  earthquake.  Circles  indicate  identification  as  an  explosion. 
Symbols  are  scaled  by  their  distance  from  the  discrimination  level  of  figure 
1.  The  smaller  explosions,  Floydada  (1991,  day  227,  figure  3a)  and  Coso 
(1991,  day  67,  figure  3b),  are  misclassified  more  frequently  than  the  larger 
explosions,  Lubbock  (1991,  day  291,  figure  3c)  and  Hoya  (1991,  day  257, 
figure  3d).  The  sampling  is  biased  because  of  the  limited  dynamic  range  of 
the  instruments,  leading  to  clipped  records  at  short  distances  for  large  events, 
and  low  signal  to  noise  level  records  at  long  distances  for  small  events. 

Figure  4.4  .  88 

To  examine  possible  geographic  variation  in  the  pattern  of  log(Lg/Pg) 
amplitudes,  we  standardize  the  distribution  of  each  event’s  discriminant 
values  (i.e.  remove  the  mean  and  normalize  so  that  one  standard  deviation 
equals 1.0). The result above is for Hoya (figure 3d). Here, symbol size is
scaled  by  proximity  to  the  mean,  with  crosses  positive  and  circles  negative. 

Note  that  the  crosses  appear  to  be  in  the  same  region  in  which 
misclassifications  occurred  for  the  smaller  explosions. 
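
The standardization described in this caption (remove the mean, then scale so that one standard deviation equals 1.0) can be sketched as follows. This is only an illustration: the use of the population rather than the sample standard deviation is an assumption, and the station values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Remove the mean and scale so that one standard deviation equals 1.0."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("all values are identical; cannot standardize")
    return [(v - mu) / sigma for v in values]

# Hypothetical log(Lg/Pg) discriminant values for one event at five stations.
log_ratios = [0.10, 0.35, -0.20, 0.05, 0.45]
standardized = standardize(log_ratios)
```

Values above the event mean come out positive and values below it negative, which matches the cross and circle convention of the plot.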

Figure  4.5  .  88 

The  normalization  described  in  figure  4  was  performed  for  10  nuclear 
explosions.  The  mean  for  each  station,  for  all  the  events  it  recorded,  was 
then  plotted  above,  revealing  a  distinct  geographic  pattern.  Although  we 
have  ignored  the  possible  effects  of  the  bias  in  sampling  (nearer  stations 
recording  more  small  events  and  more  distant  stations  recording  larger 
events),  inspection  of  results  for  individual  events  suggests  that  the  pattern  is 
robust. That the pattern appears to be common to all events, whose sources
span 50 km within NTS, argues against near source scattering strongly
affecting  the  discriminant  values. 


Figure 4.6

Log(Lg/Pg)  amplitude  ratios  for  earthquake  no.  4  (Mb  =  3.8)  of  figure  2.  As 
in  figure  3,  crosses  are  “earthquake-like”  values  scaled  by  distance  above  the 
discrimination  line  of  figure  1,  and  circles  represent  “explosion-like”  values 
scaled  by  distance  below  the  line.  For  this  shallow  earthquake  (~  2  km 
depth),  most  (100  out  of  150)  recordings  misclassify  the  event  as  an 
explosion.  This  serves  as  a  warning  that,  even  if  we  develop  perfect  path 
corrections, there will be anomalous events.

Figure 4.7

Normalized  log(Lg/Pg)  amplitude  ratios,  as  in  figure  4,  for  earthquake 
number  4  (figures  2  and  6).  The  relative  pattern  of  large  and  small  Lg  to  Pg 
amplitude  ratios  is  quite  similar  to  that  observed  for  the  nuclear  explosions 
(figure 5), further indicating that neither source radiation nor near source
scattering is important to the pattern of Lg/Pg amplitude ratio variations.

Figure 4.8

Event  classification  as  in  figure  6,  for  earthquake  no.  3  (Mb  =  4.4)  of  figure 
2.  All  of  the  stations  at  which  the  event  is  misclassified  are  clustered  at  the 
greatest  distance  from  the  source.  Although  this  alone  might  suggest  a 
potential  distance,  or  near-receiver  scattering,  or  receiver  site  effect,  we  find 
a  very  different  pattern  for  earthquake  no.  1  (figure  9). 

Figure 4.9

Event  classification  as  in  figures  6  and  8  for  earthquake  no.  1  (Mb  =  4.2)  of 
figure  2.  In  contrast  to  figure  8,  the  distinct  area  of  misclassifications  is 
nearest  the  source,  and  at  a  completely  different  set  of  stations.  Taken 
together, this figure and figure 8 indicate that neither distance, near-receiver
scattering, nor site effects control the pattern of relative Lg to Pg amplitudes
in any simple way.

Figure 4.10

Event  classification  as  in  figures  6,  8,  and  9,  for  earthquake  no.  6  (Mb  =  4.2) 
of  figure  2.  There  are  almost  no  misclassifications  for  this  event.  The  two 
areas  in  which  the  smaller  amplitude  ratios  cluster  (in  the  center  and  at  the 
northwest  edge  of  the  network),  are  distinct  from  the  areas  of  smaller 
amplitude  ratio  for  the  events  from  different  azimuths  (figures  6,  8,  and  9), 
strengthening  the  argument  against  any  influence  of  distance,  near-receiver 
scattering, or site effects.

Figure 4.11

Pn  first  motion  polarity  for  earthquake  number  1  (figures  2  and  9).  Triangles 
indicate  positive  first  motion,  and  circles,  negative.  If  the  source  radiation 
controlled  the  pattern  of  Lg/Pg  amplitude  ratio  variation  observed,  we  might 
expect  higher  Lg/Pg  amplitude  ratios  to  be  recorded  along  P-nodal  lines. 
Here, the polarities suggest a possible P-nodal plane running roughly
northwest-southeast, correlating in no way with the amplitude ratio pattern
observed. 

Figure  4.12  .  92 

Event  classification  as  in  figures  6,  8,  9,  and  10,  for  earthquake  no.  2.  There 
were  relatively  fewer  observations  for  this  Mb  =  5.4  event,  due  to  clipping, 
than  were  available  for  the  nearby  earthquake  no.  3  (figure  8),  but  the  pattern 
of  misclassifications  is  similar.  We  can  compare  this  result  to  one  predicted 
by  the  known  source  focal  mechanism. 

Figure 4.13 . 92

Ratio  of  predicted  S  to  P  amplitudes  radiated  from  the  source  of  earthquake 
no.  2.  The  relative  amplitudes  have  been  normalized  as  in  figure  4.  The 
largest  crosses  are  in  the  vicinity  of  the  P  node,  just  the  opposite  of  the 
observed  pattern,  suggesting  that  at  least  at  approximately  1  Hz  and  in  that 
distance  range,  the  focal  mechanism  has  little  effect  on  the  observed  Lg  and 
Pg amplitudes.

Chapter 5

Figure 5.1 . 106

Penalty functions (upper left) for misfit of least squares, L1-norm, and
Hampel 17a solutions and their 1st and 2nd derivatives, i.e., influence
functions (upper right) and the values for robust weights (lower left). Note
that the L1-norm does not have a true influence function, as its robust
weights would approach infinity near zero misfit. It is included here for
comparison with the other solutions.
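
As a rough illustration of the robust weights compared in this figure, the sketch below evaluates the weight w(r) = psi(r)/r for each penalty, with r a misfit scaled by its spread. The three-part redescending form is the usual Hampel estimator; the breakpoints a = 1.7, b = 3.4 and c = 8.5 are the values commonly associated with the 17A label and are an assumption here, since the caption does not state them. The function names and the `eps` guard are hypothetical.

```python
def least_squares_weight(r):
    """Least squares gives every misfit the same weight."""
    return 1.0

def l1_weight(r, eps=1e-8):
    """L1-norm weight 1/|r|; it grows without bound as the misfit nears zero.

    eps is a hypothetical guard against division by zero.
    """
    return 1.0 / max(abs(r), eps)

def hampel_weight(r, a=1.7, b=3.4, c=8.5):
    """Weight psi(r)/r for a three-part redescending Hampel estimator.

    The breakpoints a, b, c are assumed values, not taken from the source.
    """
    x = abs(r)
    if x <= a:
        return 1.0                          # behaves like least squares
    if x <= b:
        return a / x                        # behaves like the L1-norm
    if x <= c:
        return a * (c - x) / ((c - b) * x)  # weight descends toward zero
    return 0.0                              # gross outliers are rejected

weights = [hampel_weight(r) for r in (0.5, 2.0, 5.0, 10.0)]
```

Small misfits keep full weight, moderate ones are down-weighted, and misfits beyond c are ignored, which is the behavior that makes the reweighted solution insensitive to blunders.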

Table  5.1  .  110 

Differences  in  mean  errors  between  the  least  squares  (second  and  third 
columns)  and  robust  (rightmost  two  columns)  solutions  to  synthetic  data 
with  normal  (top  row)  and  exponential  (second  row)  noise,  and  large  outliers 
(bottom  rows). 

Figure  5.2  .  112 

Quantile-quantile  plots  for  the  misfits  from  the  least  squares  (top),  and 
robustly  reweighted  solution  (bottom).  If  the  misfits  were  normally 
distributed,  the  points  would  lie  along  the  diagonal  plotted.  The  large 
deviation  of  points  from  the  diagonal,  for  the  least  squares  misfits,  beginning 
at  approximately  2  standard  deviations,  indicates  that  the  distribution  of  the 
misfits  is  much  heavier-tailed  than  the  normal  distribution.  The  very  large 
outliers  most  likely  indicate  outright  blunders  in  the  data. 

Figure  5.3  .  113 

Plots  of  robust  weights  vs.  event  dates  (top  row)  for  two  of  the  stations  most 
affected  by  the  robust  reweighting,  showing  a  distinct  grouping  of  low 
weights  within  specific  time  periods.  Pre-event  noise  levels  vs.  event  dates 
(middle  row)  indicate  that  magnification  was  likely  significantly  lower  than 
was  recorded  in  instrument  parameter  logs  for  station  SBK  (left  column) 




during  that  time  period.  The  robust  weights  are  plotted  vs.  the  pre-event 
noise  levels  in  the  bottom  row.  The  correlation  is  perfect  for  SBK,  but  more 
muddled for LUC (right column).


Table 5.2

Differences between site amplifications for the SCSN with and without
incorporating censored data.


jn.A  .  . 

Differences  between  unweighted  and  a  priori  weighted  least  squares  site 
amplification  estimates.  The  robust  estimates  are  effectively  the  same 
whether we begin with or without a priori weighting, and we see that the a
priori  weighted  values  are  always  closer  to  the  presumably  more  accurate 
robustly weighted values.

Chapter 6


Figure 6.1

Stations of the southern California seismic network for which site
amplifications were calculated.

Figure 6.2 . 132

Vertical amplification at a free surface for incoming P-waves (top) and
S-waves (bottom) as a function of incidence angle.

Figure 6.3

Map  of  teleseismic  sources.  Shallow  source  epicenters  are  indicated  by 
circles,  and  deep  source  epicenters  by  inverted  triangles.  Symbols  are  scaled 
by  magnitude. 

Figure 6.4

Correlation  coefficients  of  the  teleseismic  coda  from  21  to  61  seconds  after 
the  initial  P  arrival,  for  each  pair  of  SCSN  stations  recording  a  deep  event, 
vs.  interstation  spacing,  before  (top)  and  after  beam  removal  (bottom).  The 
mb  6.5,  606  km  deep  event  was  73°  from  southern  California.  The  mean 
correlation  coefficient  was  0.055  before  beam  removal  and  0.008  after,  and 
the  slope  was  virtually  zero,  indicating  complete  removal  of  the  coherent 
coda.

Figure 6.5

Correlation  coefficients  of  the  teleseismic  coda  from  20  to  45  seconds  after 
the  initial  P  arrival,  for  each  pair  of  SCSN  stations  recording  a  shallow 
event,  vs.  interstation  spacing,  before  (top)  and  after  beam  removal  (bottom). 
The mb 6.2, 10 km deep event was 51° from southern California. The mean
correlation  coefficient  was  0.162  before  beam  removal  and  0.006  after,  and 
the  slope  was  virtually  zero,  indicating  complete  removal  of  the  coherent 
coda.

Figure 6.6

Correlation  coefficients  of  the  teleseismic  coda  from  40  to  95  seconds  after 
the  initial  P  arrival,  for  each  pair  of  SCSN  stations  recording  a  shallow 
event,  vs.  interstation  spacing,  before  beam  removal  (top),  and  after  beam 
removal  (bottom).  The  Mb  6.5,  19  km  deep  event  was  53°  from  southern 
California.  The  mean  correlation  coefficient  was  0.431  before  beam  removal 
and  -0.005  after.  There  is  a  significant  increase  in  the  correlation  coefficients 
with  decreasing  interstation  spacing,  indicating  incomplete  removal  of  the 
coherent  coda.  The  zero  distance  intercept  was  0.052.  In  this  case,  the  coda 
window  chosen  included  the  PcP  arrival,  which  is  usually,  but  not  always, 
insignificant,  and  which  may  have  been  the  source  of  the  coherent  coda, 
although  PcP  was  not  visible  in  a  record  section.  Some  other  shallow  events 
had  even  more  coherent  coda,  with  no  apparent  cause.  Those  events  usually 
were  less  distant  than  the  average. 
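
The comparison of correlation coefficients before and after beam removal can be sketched with synthetic traces. This sketch assumes the beam is the plain average of traces that have already been time-aligned; how the beam was actually formed is not described in this caption, and all names and data below are hypothetical.

```python
import math
import random

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length traces."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def remove_beam(traces):
    """Subtract the array average (the beam) from every aligned trace."""
    n = len(traces)
    beam = [sum(samples) / n for samples in zip(*traces)]
    return [[s - b for s, b in zip(trace, beam)] for trace in traces]

def mean_pairwise_correlation(traces):
    """Average correlation coefficient over all station pairs."""
    pairs = [(i, j) for i in range(len(traces)) for j in range(i + 1, len(traces))]
    return sum(correlation(traces[i], traces[j]) for i, j in pairs) / len(pairs)

# Synthetic example: a coherent signal common to 20 stations plus local noise.
random.seed(0)
coherent = [random.gauss(0.0, 1.0) for _ in range(400)]
traces = [[c + random.gauss(0.0, 1.0) for c in coherent] for _ in range(20)]

mean_before = mean_pairwise_correlation(traces)
mean_after = mean_pairwise_correlation(remove_beam(traces))
```

In this toy case the mean correlation drops from roughly one half to near zero once the beam is subtracted. For independent noise on N stations, subtracting the N-station average also introduces a small negative bias of about -1/(N-1), so a slightly negative mean after removal is not by itself a sign of a problem.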


Figure  6.7  .  149 

Record  section  for  the  shallow  event  with  the  largest  percentage  of  coherent 
coda.  The  event  was  reported  to  be  at  17  km  depth,  and  we  see  the  depth 
phases  pP  and  sP  as  predicted  for  that  depth,  at  approximately  6  and  8 
seconds  after  the  initial  P  arrival.  Some  energy  is  also  visible  at  18  seconds 
after  P,  the  time  predicted  for  PcP.  The  coda  window  used  was  from  25  to  80 
seconds  after  the  initial  P  arrival,  where  no  major  phases  were  predicted  to 
arrive.  However,  there  is  a  very  large  coherent  phase  with  similar  moveout 
to  the  initial  arrival,  at  about  45  seconds  after  the  initial  arrival. 

Figure  6.8  .  150 

Record  section  for  the  diffuse  coda  of  the  event  of  figure  7.  With  the  beam 
removed  from  each  individual  trace,  the  coherent  phases  that  were 
prominent  in  the  time  window  used  for  calibration  in  figure  7  are  no  longer 
apparent. 

Figure  6.9  .  151 

Correlation  coefficients  of  the  teleseismic  coda  from  25  to  80  seconds  after 
the  initial  P  arrival,  for  each  pair  of  SCSN  stations  recording  the  shallow 
event  of  figure  7.  The  increase  in  correlation  coefficient  with  decreasing 
station  spacing  indicates  that  not  all  of  the  steeply  incident  P-wave  energy 
was  coherent.  After  beam  removal,  the  zero  distance  intercept  was  0.056. 

The large decrease in the mean value, however, from 0.434 before beam
removal to -0.002 after, indicates that most of the coherent coda energy was
removed.

Figure  6.10  .  153 

Comparison  of  diffuse  coda  site  amplifications  calculated  from  just  the  coda 
of  the  20  deep  events  and  those  calculated  from  just  the  coda  of  the  20 
shallow  events.  The  vertical  lines  represent  2  standard  deviations  uncertainty 
about  the  deep  event  coda  site  amplifications,  and  the  horizontal  lines 
represent the same for the shallow event coda site amplifications.

Figure 6.11

Comparison of diffuse coda site amplifications versus 1.5 Hz local S-wave
coda site amplifications of Su and Aki (1995) for sites with amplifications
less than 3.0 (upper left). The coefficient of determination is a measure of how
meaningful it is to relate two variables by a sloping line (i.e. Y = aX + b).
Specifically, the coefficient of determination, $r^2$, is given by
$r^2 = 1 - \mathrm{SSE}/\mathrm{SST}$, where the sum of the squared error,
$\mathrm{SSE} = \sum y_i^2 - b \sum y_i - a \sum x_i y_i$, is a
measure of how much variation is left unexplained by the model, and the
total squared error, $\mathrm{SST} = \sum y_i^2 - (\sum y_i)^2 / n$, is a
measure of the amount of variation in the observed values of the dependent
variable. Thus SSE/SST is the proportion of the total variation that is not
predicted by the linear model, and $r^2$ is the proportion of variation in the
dependent variable that is predicted by the linear model. The correlation
between the local S-wave coda amplifications at 1.5 Hz and the diffuse coda
amplifications is about as good as between the local S-wave coda
amplifications at 1.5 Hz and at 3 Hz (upper right). The correlation is much
poorer for larger differences in frequency at the same site (lower plots).
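
The coefficient of determination defined in this caption can be checked numerically. The sketch below fits Y = aX + b by ordinary least squares and then evaluates SSE and SST with the sum formulas quoted above; note that the shortcut form of SSE holds only when a and b are the least squares estimates. The function name and the sample data are hypothetical.

```python
def coefficient_of_determination(x, y):
    """Return r^2 = 1 - SSE/SST for the least squares line Y = aX + b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xx = sum(v * v for v in x)
    sum_xy = sum(u * v for u, v in zip(x, y))
    sum_yy = sum(v * v for v in y)

    # Least squares slope a and intercept b.
    a = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x)
    b = (sum_y - a * sum_x) / n

    sse = sum_yy - b * sum_y - a * sum_xy   # variation left unexplained
    sst = sum_yy - sum_y * sum_y / n        # total variation in y
    return 1.0 - sse / sst

# Hypothetical pairs of site amplifications from two methods.
r_squared = coefficient_of_determination([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0])
```

For these four points the fitted line is Y = 0.8X + 0.5, giving SSE = 1.8 and SST = 5, so r^2 = 0.64.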


Figure 6.12

Comparison  of  diffuse  coda  site  amplifications  versus  local  1.5  Hz  S-wave 
coda  site  amplifications  of  Su  and  Aki  (1995)  for  all  amplifications  (upper 
left).  The  linear  relationship  seen  in  figure  9  breaks  down  for  the  highest 
amplification  sites,  with  approximately  5  times  higher  amplification 
estimated  from  the  1.5  Hz  local  S-wave  coda  than  from  the  diffuse  coda.  The 
local S-wave coda site amplifications are also much greater at 1.5 and 3 Hz
(upper right), than at 6 and 12 Hz (lower plots).

Chapter 7

Figure 7.1

Log(Lg/Pg)  amplitude  ratios  recorded  at  SCSN  stations  for  event  number 
one  of  figure  4.2.  The  values  are  demeaned,  with  values  greater  than  the 
mean  plotted  as  crosses  and  those  less  than  the  mean  plotted  as  circles. 
Symbols  are  scaled  by  distance  from  the  mean.  Arrows  indicate  the 
propagation  direction. 


Figure 7.2

Lg amplitudes, after correction for site amplifications, for event one.

Figure 7.3

Pg amplitudes, after correction for site amplifications, for event one.

Figure 7.4

Log(Lg/Pg) amplitude ratios for event number three of figure

Figure 7.5 . 172

Lg  amplitudes,  after  correction  for  site  amplifications,  for  event  three. 

Figure  7.6  .  173 

Pg  amplitudes,  after  correction  for  site  amplifications,  for  event  three. 

Figure  7.7  .  174 

Log(Lg/Pg)  amplitude  ratios  for  a  nuclear  explosion  at  NTS. 

Figure 7.8 . 175

Lg  amplitudes,  after  correction  for  site  amplifications,  for  the  NTS  event. 

Figure  7.9  .  176 

Pg  amplitudes,  after  correction  for  site  amplifications,  for  the  NTS  event. 

Figure  7.10  .  178 

Log(Lg/Pg)  amplitude  ratios  for  event  number  six  of  figure  4.2. 

Figure 7.11 . 179

Lg  amplitudes,  after  correction  for  site  amplifications,  for  event  six. 

Figure 7.12 . 180

Pg amplitudes, after correction for site amplifications, for event six.

Figure 1 . 185

The top panel shows receiver functions computed by frequency-domain
deconvolution  of  14  events  at  Arti,  Russia  (ARU).  The  middle  panel  shows 
the  stack  of  these  14  receiver  functions  and  the  traces  representing  ±  two 
standard  deviations  of  the  mean.  The  bottom  panel  shows  the  receiver 
function  computed  using  simultaneous  time-domain  deconvolution  of  the 
same  14  events  and  the  traces  representing  ±  two  standard  deviations  of  the 
mean. 

Figure  2  .  187 

Misfit  versus  model-norm  size  for  various  Lagrange  multipliers  (p)  applied 
to  the  simultaneous  deconvolution  to  compute  the  receiver  function  shown 
on  the  bottom  of  Fig.  1.  The  misfit  is  defined  as  the  rms  difference  between 
the  observed  radial  component  of  the  seismograms  and  those  predicted  by  a 
reconvolution  of  the  receiver  function  with  the  vertical  component.  The 
model-norm  size  for  this  example  is  defined  as  the  rms  sum  of  all  the 
components  of  the  receiver  function.  To  compute  the  receiver  function 
shown  on  the  bottom  panel  of  Fig.  1  we  used  p  =  102. 
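
The two quantities plotted against each other in this figure can be sketched for a single event. The misfit follows the definition in the caption (rms difference between the observed radial component and the receiver function reconvolved with the vertical component). Reading "rms sum of all the components" as the rms of the receiver function samples is an interpretation, and the function names and traces are hypothetical.

```python
import math

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

def rms(values):
    """Root mean square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def rms_misfit(radial, vertical, receiver_function):
    """rms difference between the observed radial and the reconvolved prediction."""
    predicted = convolve(receiver_function, vertical)[:len(radial)]
    return rms([obs - pred for obs, pred in zip(radial, predicted)])

def model_norm(receiver_function):
    """rms size of the receiver function (assumed reading of the caption)."""
    return rms(receiver_function)

# Hypothetical noise-free example: the radial trace is exactly the receiver
# function convolved with the vertical trace, so the misfit is zero.
vertical = [1.0, 0.5, 0.0, 0.0]
receiver_function = [0.4, 0.0, 0.2]
radial = convolve(receiver_function, vertical)[:len(vertical)]

exact_misfit = rms_misfit(radial, vertical, receiver_function)
shifted_misfit = rms_misfit([v + 0.1 for v in radial], vertical, receiver_function)
```

Plotting misfit against model norm for a range of damping values gives the kind of trade-off curve the caption describes; in the simultaneous deconvolution the misfit would be accumulated over all events rather than one.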

Table  1  .  187 

Velocity model used to generate the synthetic seismograms.

Figure 3

The  uppermost  ‘idealized  receiver  function’  was  produced  by  deconvolution 
of  noise-free  synthetic  seismograms  (assuming  a  delta-function  source).  The 
bottom  and  middle  receiver  functions  were  computed  by  simultaneous 
deconvolution  of  five  and  25  synthetic  seismograms  (described  in  the  text) 
respectively.  The  numbers  plotted  above  each  of  the  receiver  functions,  on 
this  as  well  as  all  the  following  figures,  are  the  rms  misfits  between  all  of  the 
observed  horizontal  components  of  the  seismograms  and  the  convolution 
products  of  the  receiver  function  and  the  respective  vertical  components. 
The  seismograms  were  weighted  for  the  rms  misfit  calculation  just  as  they 
were  for  the  deconvolution. 


Figure 4

The  uppermost  ‘idealized  receiver  function’  was  produced  by  deconvolution 
of  noise-free  synthetic  seismograms  (assuming  a  delta-function  source).  The 
second  and  third  receiver  functions  from  the  top  were  computed  by  stacking 
25  and  five  receiver  functions  (respectively),  each  computed  by  single-event 
frequency-domain  deconvolution  of  the  same  synthetic  seismograms  as  used 
to  produce  Fig.  2.  The  bottom  receiver  function  was  computed  by  the 
frequency-domain  deconvolution  of  the  uncut  vertical  components  (see  text) 
from  the  respective  25  horizontal  components  followed  by  stacking. 

Figure 5

(Top) Receiver function produced by simultaneous time-domain
deconvolution of 25 events recorded at PFO. (Bottom) Receiver function
computed by stacking 25 receiver functions computed individually from the
same recordings by spectral division. Each trace was normalized by its
peak amplitude.

Figure 6

Receiver  functions  computed  by  simultaneous  lower-bounded  deconvolution 
of  the  25  synthetic  seismograms  used  in  computing  Figs.  3  and  4.  The  top 
receiver  function  has  a  lower  bound  of  zero.  The  lower-bound  constraint 
decreases  from  top  to  bottom.  The  bottom  receiver  function  has  no  lower- 
bound  constraint.  Numbers  above  the  traces  are  the  rms  misfits. 
Figure 7

Receiver  functions  computed  by  simultaneous  lower-bounded  deconvolution 
of  the  25  synthetic  seismograms  used  in  computing  Fig.  5.  The  top  receiver 
function  has  a  lower  bound  of  zero.  The  lower-bound  constraint  decreases 
from  top  to  bottom.  The  bottom  receiver  function  has  no  lower  bound 
constraint.  Numbers  above  the  traces  are  the  rms  misfits. 
The two distinct parts of this dissertation deal with isolating the effect of structure
immediately  beneath  seismic  stations  on  seismic  waveforms.  The  first  part  deals  with 
the  inference  of  velocity  discontinuities  beneath  a  single  broadband  3-component 
seismic  station  using  receiver  functions,  in  which  the  P-to-S  converted  phases 
generated  beneath  a  seismic  station  are  isolated  by  deconvolution  of  the  horizontal 
component  seismograms  by  the  vertical. 

We  improve  the  deconvolution  itself  by  the  development  of  a  time  domain 
inversion  for  the  receiver  function.  We  extend  the  technique’s  application  to  an  area  of 
complex  structure,  using  data  collected  at  Pinon  Flat  Observatory,  California,  where 
we  improve  understanding  of  the  region’s  structure  and  tectonic  framework.  We  are 
able  to  make  inferences  about  complex  Moho  topography  and  corroborate  them  using 
observations  of  P-PmP  differential  times,  and  P-wave  polarizations.  We  also  analyze 
the  uncertainties  in  receiver  function  waveforms  using  both  synthetic  and 
real data, demonstrating that regularized deconvolution of noisy seismograms
significantly  biases  receiver  function  amplitudes. 

The second part of the dissertation deals with the effect of near-receiver velocity
structure  on  seismic  wave  amplitudes.  We  use  the  near-receiver-scattered  component 
of  teleseismic  coda  to  calibrate  site  amplifications  for  the  southern  California  seismic 
network.  This  required  the  development  of  a  technique  to  separate  the  near-receiver- 
scattered  component  of  teleseismic  coda  from  the  near-source-scattered  component. 
We  also  developed  and  applied  appropriate  statistical  analysis  tools  to  permit  accurate 
estimation  of  the  site  amplifications  from  the  doubly  censored  non-Gaussian  data. 
Specifically,  we  use  maximum  likelihood  estimation  to  incorporate  censored  data,  and 
the  robust  statistical  technique  of  iteratively  reweighting  the  inversion  based  on  the 
misfit  to  reduce  biasing  of  parameter  estimates  by  outliers. 

The  main  purpose  of  estimating  site  amplifications  has  been  to  enable  isolation  of 
propagation  effects  on  Lg  amplitudes,  which  is  important  to  understand  for  accurate 
monitoring  of  nuclear  testing.  We  conclude  with  the  application  of  the  site 
amplifications  to  Lg  of  regional  events,  demonstrating  that  they  are  successful  in 
isolating  propagation  from  site  effects  on  Lg  amplitudes. 


Chapter  1 


Introduction  to  the  dissertation 


Goals  and  Research  Accomplished 

A  seismogram  contains  information  about  the  source  of  the  energy  and  the  structure  of  the  Earth 
between  the  source  and  the  receiver.  Increased  understanding  of  seismic  sources  and  earth  structure  is 
gained  by  studies  designed  to  isolate  information  from  just  the  source  or  from  just  one  small  portion  of 
the  path.  There  are  two  distinct  parts  to  this  thesis,  both  of  which  are  designed  to  isolate  the  effect  of 
structure  immediately  beneath  the  station  at  which  seismograms  are  recorded.  A  significant  portion  of 
the  work  in  both  parts  involves  the  development  of  processing  and  analysis  tools,  a  useful  end  in  itself. 

The  first  part  of  the  thesis  deals  with  the  inference  of  velocity  structure  beneath  a  single  broadband 
3-component seismic station by receiver function analysis. Complexity in teleseismic P waveforms due
to the source and to path effects distant from the station is exclusively associated with P-wave energy,
due to the much slower propagation velocity, and so later arrival, of S-waves. The P-waves are recorded
well  on  both  vertical  and  horizontal  seismograms.  The  only  shear  waves  in  the  early  part  of  teleseismic 
records  must  have  been  generated  by  P-to-S  conversion  just  beneath  the  station,  and  are  recorded 
effectively  only  on  the  horizontal  components.  The  receiver  functions  are  computed  by  the 
deconvolution  of  the  horizontal  seismograms  by  the  corresponding  vertical  seismogram.  This  removes 
the  source  and  distant  path  complexity  from  the  records,  leaving  a  time  series  that  represents  the  P-to-S 
converted  phases  generated  beneath  a  recording  station.  The  receiver  function  is  then  interpreted  in 
terms  of  velocity  structure  beneath  the  station. 

Our  initial  contribution  to  this  area  is  the  improvement  of  the  deconvolution  itself,  by  the 
development  of  a  time  domain  inversion  for  the  receiver  function,  discussed  in  appendix  1.  The 
application  of  the  technique  to  an  area  of  complex  3-dimensional  structure  has  had  two  important 
results, described in chapter 2. We extended the technique’s application to areas of complex structure,
especially  to  making  inferences  about  complex  Moho  topography,  by  incorporating  other  observations 
of  the  same  teleseismic  body  waves,  such  as  P-PmP  differential  times,  and  P-wave  polarizations.  We 
also  improved  understanding  of  the  region’s  structure  and  tectonic  framework  through  the  consideration 
of  our  observations  in  light  of  other  pertinent  geophysical  studies  in  the  region.  In  chapter  3,  we  provide 
a  very  thorough  analysis  of  the  uncertainties  in  receiver  function  waveforms.  Until  now,  studies  of 
uncertainty in receiver function analysis have used only noise-free synthetics, and uncertainty bounds
have  been  invariably  underestimated  by  consideration  of  only  statistical  error.  We  analyze  the  effect  of 
noise  on  receiver  function  waveforms,  using  both  synthetic  and  real  data,  and  demonstrate  that 
regularized  deconvolution  of  noisy  seismograms  significantly  biases  receiver  function  amplitudes.  We 
also  examine  common,  yet  heretofore  overlooked  sources  of  error  in  receiver  function  interpretation 
due  to  errors  in  commonly  made  physical  assumptions. 

The second part of the thesis also deals with the isolation of the effects of near-receiver velocity
structure  on  seismic  signals,  but  for  regional  data  at  higher  frequency.  Whereas  part  one  was  driven 
largely  by  the  desire  to  understand  the  earth  better,  this  part  of  the  thesis  also  has  an  immediate  and 
quite  important  practical  application.  That  is,  understanding  regional  propagation  is  important  to  the 
verification  of  a  global  nuclear  test  ban  treaty,  which  is  discussed  in  chapter  4.  We  focus  largely  on 
understanding  variations  in  amplitude  of  the  Lg  phase,  so  that  we  can  predict  them  elsewhere.  Lg  is 
largely  composed  of  shear-wave  energy  traveling  in  the  crustal  waveguide,  or  equivalently  higher  mode 
surface  waves,  and  figures  prominently  in  many  regional  seismic  discriminants.  The  problem  is 
complicated,  and  the  work  presented  here  lays  the  foundation  for  answering  some  fundamental 
questions.  The  basic  scientific  goal  is  to  understand  better  the  physics  of  blockage  and  attenuation  of 
Lg,  which  can  occur  over  just  15  to  20  km  of  propagation.  The  long  term  practical  goal  of  the  work  is  to 
enable  prediction  of  variations  in  regional  phases  important  to  discrimination.  Ideally,  such  prediction 
would  be  based  on  other  globally  available  geophysical  parameters,  such  as  topography  and  gravity, 
from  which  crustal  thickness  may  be  inferred. 
Improved understanding of Lg propagation has been held back by two limitations in the data
available.  One  is  that  Lg  amplitude  varies  dramatically  over  very  short  length  scales  along  paths 
traversing  various  types  of  structure,  but  dense  spatial  coverage  along  a  variety  of  structures  is  rarely 
available.  That  limitation  is  overcome  by  using  data  from  the  southern  California  seismic  network 
(SCSN)  which  consists  of  hundreds  of  stations  over  a  large  area  of  tremendously  variable  velocity 
structure.  The  second  limitation  is  that  site  amplifications  are  rarely  known  at  any  recording  sites,  so  it 
is  impossible  to  compare  absolute  amplitudes  of  seismic  energy  from  one  station  to  the  next  and 
attribute  the  difference  to  path  effects  between  the  sites.  Much  of  the  rest  of  the  thesis  deals  with 
overcoming  this  limitation.  So  although  the  purpose  of  this  work  is  to  understand  regional  propagation 
by  isolating  the  effects  of  short  path  segments  on  amplitudes  of  specific  phases,  it  is  necessary  to  first 
isolate  and  quantify  the  effect  of  the  site  structure. 

The  near-receiver-scattered  component  of  teleseismic  coda  is  used  to  calibrate  site  amplifications. 
The  data  are  doubly  censored,  and  contain  significant  large  errors  and  non-Gaussian  noise,  so  it  is 
necessary  to  first  develop  appropriate  statistical  analysis  tools,  which  we  discuss  in  chapter  5.  In  chapter 
6  we  present  the  development,  application,  and  testing  of  a  technique  to  separate  the  near-receiver- 
scattered  component  of  teleseismic  coda  from  the  near-source-scattered  component.  We  then  examine 
the suitability of near-receiver-scattered coda as an isotropic source of Lg-like energy. Finally, in
chapter 7 we demonstrate an application of the site amplifications to Lg of regional events, and discuss
the continuation of this work.


Chapter  2 


Constraints on crustal structure and complex Moho topography beneath Pinon Flat, California,
from teleseismic receiver functions

Abstract 

We use teleseismic P-waves recorded at Pinon Flat Observatory (PFO) to constrain the 3-dimensional
crustal and upper-mantle velocity structure beneath the station. By forward modeling radial
receiver function waveforms we construct a 1-dimensional crustal model which includes a significant
shear-velocity inversion at 9 km depth. Arrivals on the tangential components indicate dip of at least the
uppermost discontinuity. Complicated Moho topography, deepening to the northwest of PFO, is suggested
by azimuthal dependence of travel times and amplitudes of the receiver functions and travel times
of crustal P-wave reverberations. Although fine details cannot be resolved, each of those sets of observations
plus mislocation vectors provide strong indications of abrupt Moho topography, possibly
including step offsets of several kilometers. This is not only consistent with gravity data in implying Airy
isostasy with compensation at Moho depth, but extends that model to a much finer length scale than had
been resolved.


Introduction 

Pinon  Flat  Observatory  (PFO)  is  located  in  the  tectonically  active,  structurally  complex  transitional 
area  between  the  high  elevation  San  Jacinto  Mountains  and  the  below  sea-level  Salton  Trough  (figure  1). 
The trough, characterized by high heat flow and thin crust associated with active rifting (e.g., Elders et al.,
1972;  Fuis  et  al.,  1984),  is  the  northern  continental  extent  of  the  spreading  center  extending  throughout 
the  Gulf  of  California.  PFO,  at  1288  m  elevation  on  the  southeast  edge  of  the  San  Jacinto  massif,  is  30  km 
southeast  of  the  3302  m  San  Jacinto  Peak,  and  10  km  southwest  of  the  near  sea  level  edge  of  the  trough. 
Figure 2.1: Shaded relief map of the area around PFO, CA. The San Jacinto and San Andreas faults
are  shown  to  the  southwest  and  northeast  of  PFO  respectively.  Some  smaller  cross-faults  near  PFO, 
referred  to  in  the  text,  are  included  (from  Rogers,  1965).  The  circle  about  PFO  is  of  10  km  radius. 
The  sea  level  contour  is  indicated  by  a  dashed  line. 
Resting on a broad flat area of the Mesozoic granitic rock that forms the bulk of the Peninsular Ranges,
PFO is surrounded by fault-controlled contacts with older metamorphic rocks (Rogers, 1965) and is
located between the San Jacinto fault zone 12 km to the southwest and the San Andreas fault zone 25 km
to the northeast. Teleseismic waves of events arriving at this single broad-band station from a range of
azimuths sample the varying crust and upper mantle of this complex region. We find that with data from
a range of azimuths, the information in receiver functions can provide strong constraints on the structure
of such a complex region.


Tectonic  and  Geological  Setting 

We  review  briefly  the  area’s  geology,  from  the  surface  downward.  Granitic  bedrock  overlain  by  a 
thin  layer  of  weathered  granite  forms  the  surface  at  PFO.  Near  surface  P  and  S  velocities  of  5.4  and  3.0 
km/sec  respectively  were  obtained  at  this  site  from  a  300  meter  deep  borehole  log  (Fletcher,  et  al.,  1990). 
The  base  of  the  granitic  portion  of  the  batholith  is  estimated,  based  on  aeromagnetic  data,  to  be  at  10  to 
12  km  depth  (Jachens,  1991).  A  travel  time  tomography  study  over  a  local  network,  with  its  easternmost 
station  at  PFO,  finds  simple,  smoothly  varying  velocity  structure  with  no  need  for  velocity  inversions  or 
changes  in  the  Poisson  ratio  (Scott,  1992,  Scott  et  al.,  1994).  Its  best  resolution  is  between  9  and  16  km 
depth,  where  the  P/S  velocity  ratio  averages  1.71,  significantly  less  than  the  surface  value  of  1.8. 
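
As a side calculation, the two P/S velocity ratios quoted above can be expressed as Poisson's ratios through the standard relation for an isotropic elastic solid, nu = (r^2 - 2) / (2 (r^2 - 1)), where r = Vp/Vs. The following minimal Python sketch is illustrative only; the function name is hypothetical, and a Poisson solid (r = sqrt(3), nu = 0.25) is included for reference.

```python
import math


def poisson_ratio(vp_vs: float) -> float:
    """Poisson's ratio of an isotropic elastic solid from its Vp/Vs ratio."""
    r2 = vp_vs ** 2
    return (r2 - 2.0) / (2.0 * (r2 - 1.0))


# Near-surface borehole velocities quoted in the text (km/sec).
borehole = poisson_ratio(5.4 / 3.0)
# Average P/S velocity ratio between 9 and 16 km depth from the tomography.
tomography = poisson_ratio(1.71)
# Reference case: a Poisson solid has Vp/Vs = sqrt(3).
poisson_solid = poisson_ratio(math.sqrt(3.0))

for label, nu in [("borehole", borehole),
                  ("tomography", tomography),
                  ("Poisson solid", poisson_solid)]:
    print(f"{label:14s} nu = {nu:.3f}")
```

In this sketch the surface ratio of 1.8 lies above the Poisson-solid value and the mid-crustal ratio of 1.71 lies slightly below it.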

A  pervasive  zone  of  low  resistivity  with  its  top  at  10  km  depth  was  imaged  by  a  magnetotelluric 
traverse  of  the  southern  Peninsular  range,  passing  about  90  kilometers  south  of  PFO  (Park  et  al.,  1992). 
It  is  attributed  to  saline  fluids  trapped  in  their  upward  migration  at  an  impermeable  boundary. 

Gravity  data  (Jachens  and  Griscom,  1985)  indicate  that  the  region  is  roughly  in  isostatic  equilibrium 
for an Airy model. For compensation at the Moho, Moho topography would roughly mirror surface topography
multiplied by some scale factor. That, and the region’s precipitous surface topography (figure
1), predict extreme Moho topography. For example, for such isostasy to hold, San Jacinto Peak, 30 km
northwest of PFO, would require a crustal root 9 km thicker than that beneath PFO (equivalent to a 17°
Moho dip). That seems to present a contradiction with an apparently strong elastic crust. Seismicity of
magnitude greater than 2 on the San Jacinto and San Andreas Faults near PFO, to depths of 20 and 22 km
respectively (Sanders, 1990), suggests that the lithosphere has some strength to at least those depths. The
flexural rigidity of 20 km thick, unbroken elastic lithosphere should virtually fully support topography
that is much less than 1000 km wavelength (e.g., Turcotte and Schubert, 1982), and so San Jacinto Peak
should  be  supported  flexurally.  To  understand  this  apparent  contradiction  requires  consideration  of  the 
region’s  tectonic  history. 
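
The size of the crustal root quoted above can be checked with a back-of-the-envelope Airy calculation. The sketch below assumes a crustal density of 2.7 g/cm^3 and a mantle density of 3.3 g/cm^3; these densities and the function name are assumptions made for illustration and are not taken from the text. Under those assumptions, the 2014 m elevation difference between San Jacinto Peak and PFO implies a root roughly 9 km thicker, and spreading that offset over 30 km gives an average Moho dip of about 17 degrees.

```python
import math


def airy_root_km(elevation_km, rho_crust=2.7, rho_mantle=3.3):
    """Extra crustal root (km) that compensates extra elevation under Airy isostasy.

    Densities are in g/cm^3 and are assumed values, not taken from the text.
    """
    return elevation_km * rho_crust / (rho_mantle - rho_crust)


# Elevations quoted in the text (m), converted to a difference in km.
elevation_difference_km = (3302.0 - 1288.0) / 1000.0
horizontal_distance_km = 30.0

root_km = airy_root_km(elevation_difference_km)
dip_deg = math.degrees(math.atan2(root_km, horizontal_distance_km))

print(f"extra root thickness: {root_km:.1f} km")
print(f"average Moho dip:     {dip_deg:.0f} degrees")
```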

PFO  is  located  within  the  eastern  margin  of  the  Cenozoic  Great  Basin  extension,  where  much  of  the 
extension,  occurring  between  6  and  13  m.y.  ago  (Ekren  et  al.,  1968,  Anderson  et  al.,  1972,  Eberly  and 
Stanley,  1978)  was  concentrated  (Proffett,  1977).  It  is  also  near  the  San  Andreas  fault,  which  formed 
between  5.5  and  7  million  years  ago,  when  the  transform  boundary  between  the  Pacific  Plate  and  the 
North American Plate "jumped" inland, to meet the north end of the spreading Gulf of California (Atwater,
1970, Curray and Moore, 1984). The transform boundary may have even jumped to its present
position  due  to  the  weakness  of  the  lithosphere  there  relative  to  that  at  its  previous  position  outside  the 
range  of  extension.  There  is  general  agreement  that  prior  to  the  uplift  of  the  Peninsular  Ranges  there  was 
a  flexurally  repressed  crustal  root  (e.g.  O’Connor  and  Chase,  1989).  Uplift  may  have  been  initiated  when 
the  San  Andreas  fault  shifted  to  its  current  position,  fracturing  the  crust  and  releasing  the  restrained 
buoyancy forces (O’Connor and Chase, 1989), or earlier during the late Cenozoic extension (Dokka and
Merriam,  1982;  Stock  and  Hodges,  1990).  Regardless  of  timing,  initiation  of  uplift  depended  on  the  crust 
being  fractured  throughout  its  elastic  portion,  permitting  offsets  at  the  Moho.  The  stress  regime  at  that 
time,  that  permitted  isostatic  balance  to  be  reached,  was  not  necessarily  the  same  as  today’s. 

We  have  mentioned  a  simple  model  of  Airy  isostasy  with  compensation  at  the  Moho,  which  is  not 
necessarily the only possible explanation for the gravity data. Pratt isostasy may be indicated by variations
in upper mantle velocities (e.g. Hearn and Clayton, 1986, Sung and Jackson, 1992), although those
heterogeneities were observed on much greater scale lengths than considered here. High density crustal
basement, suggested for the southern Salton trough based on gravity data (Fuis et al., 1984), is another
possible explanation.

Deeper  structure  was  probed  by  a  refraction  experiment  (Benz  and  McCarthy,  1994),  from  which 
was  inferred  an  upper  mantle  LVZ  at  40  to  55  km  depth  throughout  the  Basin  and  Range  -  Colorado 
Plateau  transition.  Walck  (1984)  used  array  mislocations  to  infer  the  existence  of  an  east-west  trending 
antiform  with  axial  depth  of  100  km,  70  km  north  of  PFO. 


Data 

We have analyzed 117 high signal-to-noise ratio recordings of teleseismic P-waves at PFO
over a thirty month period. These events are well distributed in azimuth and distance (figure 2). We
analyzed these data using the receiver function technique (e.g. Langston, 1979; Owens et al., 1984). For
horizontally layered structure, receiver functions are ideally a series of spikes in which each arrival represents
a P-to-S converted phase or some multiply reflected phase beneath the receiver that ends in an
S-wave leg. In the case of dipping interfaces, receiver function arrivals may also represent P multiples.
The receiver functions are estimated by deconvolution of the radial and transverse components of the
seismogram by the vertical. The advantage of using receiver functions instead of original seismograms is
that source and path complexities, which are present in all components of the seismograms, are in principle
removed by the deconvolution, isolating the local earth response in the receiver functions. From
receiver  functions,  the  existence  of  discontinuities  beneath  a  station,  and  their  approximate  depths  and 
velocity  contrasts  may  be  inferred.  The  receiver  functions  used  here  for  waveform  modeling  and  analysis 
of  Moho  topography  were  calculated  using  a  simultaneous  time  domain  deconvolution  of  records  from 
multiple  events  from  the  same  source  region  (Gurrola  et  al.,  1995). 
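
To illustrate the idea of a simultaneous time-domain deconvolution, the sketch below solves a damped least-squares problem in which a single receiver function must predict the radial component of every event when convolved with that event's vertical component. It is a minimal illustration under stated assumptions, not the algorithm of Gurrola et al. (1995): the function names, the simple Tikhonov damping term, the white-noise test signals, and the equal weighting of events are all choices made here for brevity.

```python
import numpy as np


def convolution_matrix(z, n_rf):
    """Matrix Z such that Z @ rf equals np.convolve(z, rf)[:len(z)]."""
    n = len(z)
    Z = np.zeros((n, n_rf))
    for k in range(n_rf):
        Z[k:, k] = z[: n - k]
    return Z


def simultaneous_deconvolution(verticals, radials, n_rf, damping):
    """Damped least-squares receiver function shared by all events (hypothetical helper)."""
    Z = np.vstack([convolution_matrix(z, n_rf) for z in verticals])
    r = np.concatenate(radials)
    lhs = Z.T @ Z + damping * np.eye(n_rf)
    return np.linalg.solve(lhs, Z.T @ r)


# Synthetic test: a known spike-train receiver function and five noisy events.
rng = np.random.default_rng(0)
true_rf = np.zeros(40)
true_rf[0], true_rf[6], true_rf[25] = 1.0, 0.3, -0.2

verticals = [rng.standard_normal(200) for _ in range(5)]
radials = [np.convolve(z, true_rf)[:200] + 0.05 * rng.standard_normal(200)
           for z in verticals]

rf = simultaneous_deconvolution(verticals, radials, n_rf=40, damping=1.0)
rf_heavy = simultaneous_deconvolution(verticals, radials, n_rf=40, damping=1000.0)

print("largest error, light damping:", np.max(np.abs(rf - true_rf)))
print("first-spike amplitude, heavy damping:", rf_heavy[0])
```

With light damping the spikes are recovered closely in this test. Heavy damping shrinks the amplitudes toward zero, which illustrates how regularization can bias receiver function amplitudes.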

We  interpret  the  waveforms  of  the  receiver  functions  using  a  simple  forward  modeling  approach,  the 
result  of  which  is  then  modified  based  on  independent  information  regarding  the  local  structure.  For 
further insight into the apparently complicated Moho topography, we have incorporated other
observations, specifically, the azimuthal dependence of teleseismic P-wave mislocation vectors and
crustal P-wave reverberation travel times.

Figure 2.2: Station-centered plot of 117 events at PFO by ray parameter and backazimuth. Circles, from
the outermost inward, are at 0.08, 0.06, and 0.04 sec/km. Nearby events have been grouped according
to ray parameter and backazimuth (the events in each group are encircled). The numbers correspond to
the receiver functions in figure 8. Isolated events are used in the individual analyses discussed in the text.


Forward  Modeling 

Radial PFO receiver functions do not vary with backazimuth for the first 3.4 seconds and can therefore
be modeled using homogeneous horizontal layers, but the timing of the arrival interpreted as the
Moho Ps phase does vary systematically with backazimuth, from 3.4 to 4.2 seconds after the initial arrival
(for simplicity, we will refer to the initial peak as P). We have used events from all azimuths, binned only
by ray parameter, to calculate the receiver functions used for plane-layered crustal modeling. Receiver
functions calculated for the two smallest ray parameter bins are the least complicated (figure 3). Presumably
this is because the ray paths for these waves are steepest and so least sensitive to lateral
heterogeneities  in  the  crust.  We  model  these  two  traces  to  obtain  a  one  dimensional  crustal  model. 

Receiver  functions  contain  information  about  the  existence  of  discrete  velocity  discontinuities  but 
are relatively insensitive to smooth velocity variations between these discontinuities. To minimize complexity
in our forward modeling in a manner consistent with the type of information available in the data,
we attempt to construct a model with as few discontinuities as possible. This contrasts with a common
approach in recent receiver function inversion studies (e.g., Ammon et al., 1990), which have focussed on
constructing  the  smoothest  model  with  many  thin  layers.  Since  the  features  we  are  trying  to  fit  are  few 
and  simple,  fitting  them  by  forward  modeling  proves  to  be  straightforward  and  provides  some  insight  into 
the  trade-offs  involved. 

Our  systematic  approach  is  as  follows.  We  choose  P  and  S  velocities  for  the  uppermost  layer  and 
maintain  the  same  P-to-S  velocity  ratio  throughout  the  crust.  The  first  arrival  after  P  is  modeled  as  the  Ps 
converted  phase  from  a  velocity  discontinuity.  The  depth  of  the  discontinuity  is  chosen  to  match  the 
timing  and  the  magnitude  of  the  velocity  jump  is  chosen  to  match  the  amplitude.  Comparisons  are  made 
using ray synthetics (Langston, 1977) convolved with a 0.2 second half-width Gaussian function to
approximate the frequency content of the data. The next arrival in the data that cannot be interpreted as a
reverberation off the first discontinuity is modeled as the Ps phase from a second discontinuity. The
timing of receiver function arrivals is a robust measurement, and so depth, given some velocity, will be
well constrained. Receiver function amplitudes, on the other hand, are very sensitive to noise, errors in
physical assumptions (e.g. direction of arrival of rays, dip of layers), and the specifics of the deconvolution
technique (Baker, et al., 1996), and so the amplitudes of velocity jumps are very poorly constrained.

Receiver Functions for Events Grouped by Distance

Figure 2.3: Receiver functions calculated for events binned by ray parameter only. The average
distance in degrees for each set of events is to the left of each trace.

We  start  with  the  P  and  S  velocities  of  5.4  and  3.0  km/sec  respectively  for  the  uppermost  layer, 
obtained from the 300 meter deep borehole at PFO (Fletcher, et al., 1990). The first arrival after P appears
as a shoulder on the right side of the main peak at 0.6 seconds. It is modeled as the Ps phase of a discontinuity
at 3.4 km depth. The model and fit to the data are shown in the top row of figure 4. The data shown
are from the steepest incidence group of figure 3. The PpPiS and PpSiS phases (labeled in figure 4, top) in
the synthetics also match arrivals in the data, supporting both our initial assumption that the first arrival
after P is a Ps phase and our choice of Poisson’s ratio. PpPiS describes the ray that reflects as a P-wave off
the free surface and then is converted to S upon reflection upward at the i’th discontinuity. PpSiS is
similar, but with the conversion occurring at the free surface reflection. The arrivals in the data would not
have been matched as reverberations off a single discontinuity if we had assumed a Poisson solid. Although
reverberations within a low-velocity surface layer could provide an alternative explanation of the
early arrivals, the high velocities in the borehole and the granitic geology of the site virtually rule out the
existence  of  such  a  layer. 
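
The relation between interface depth and arrival time used in this kind of modeling can be sketched with the standard plane-wave delay formulas for a single uniform layer. The Python example below is a simplified illustration, not the ray-synthetic code used in the text: the function names are hypothetical, and the ray parameter of 0.04 sec/km is an assumed value meant to represent steeply incident events. For the borehole velocities and an interface at 3.4 km depth it gives a Ps delay of roughly half a second, with the two multiples arriving later.

```python
import math


def vertical_slowness(v, p):
    """Vertical slowness (sec/km) for velocity v (km/sec) and ray parameter p (sec/km)."""
    return math.sqrt(1.0 / v**2 - p**2)


def layer_phase_delays(h, vp, vs, p):
    """Delays after direct P (sec) for an interface at depth h (km)
    beneath a uniform layer with velocities vp and vs (km/sec)."""
    eta_p = vertical_slowness(vp, p)
    eta_s = vertical_slowness(vs, p)
    return {
        "Ps": h * (eta_s - eta_p),    # P converted to S at the interface
        "PpPs": h * (eta_s + eta_p),  # surface-reflected P, converted on upward reflection
        "PpSs": 2.0 * h * eta_s,      # converted to S at the free surface reflection
    }


def depth_from_ps_delay(t_ps, vp, vs, p):
    """Interface depth (km) implied by a Ps delay, given the layer velocities."""
    return t_ps / (vertical_slowness(vs, p) - vertical_slowness(vp, p))


RAY_PARAMETER = 0.04  # sec/km; assumed value for a steeply incident event

delays = layer_phase_delays(h=3.4, vp=5.4, vs=3.0, p=RAY_PARAMETER)
for name, t in delays.items():
    print(f"{name}: {t:.2f} sec")

# Round trip: the Ps delay maps back to the depth it was computed from.
recovered_depth = depth_from_ps_delay(delays["Ps"], 5.4, 3.0, RAY_PARAMETER)
print(f"recovered depth: {recovered_depth:.1f} km")
```

Because the depth follows directly from the delay once the velocities are fixed, an error in the assumed velocities or in Poisson's ratio maps directly into an error in depth.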

The  first  feature  in  the  data  not  accounted  for  by  the  first  interface  (figure  4,  row  1)  is  the  large 
negative trough at 1.0 seconds, which we interpret as the Ps arrival from the top of a low-velocity zone
(LVZ).  A  low-velocity  surface  layer  would  again  have  been  the  only  simple  structure  that  could  provide 
an  alternative  explanation  for  the  arrival.  We  add  a  velocity  inversion  with  a  0.7  km/sec  drop  in  S-wave 
velocity  in  the  second  model  (figure  4,  row  2)  at  9.2  km  depth.  A  velocity  increase  at  17.2  km  depth 
produces a Ps arrival at 2.6 seconds to match the peak in the data at that time (figure 4, row 3). The Moho
Ps arrivals in these "all azimuth" receiver functions are broad, a result of simultaneously deconvolving
events from various backazimuths, whose Moho Ps phase varies in time. For now we put the fourth and
final discontinuity into our simple model at 25.2 km (figure 4, row 4). For the earlier arrivals, variations
between the synthetic receiver functions and the data are less than those between different groupings of
events (figures 3, 9), suggesting that we have fit the data as well as the noise allows.

Figure 2.4: Incremental development of a simple velocity model. On the right side, the receiver function
from the steepest incidence angle group of figure 3 is plotted as a solid line and the synthetic receiver
function corresponding to the velocity model shown to the left is plotted as a dashed line. Phases
indicated in the synthetics are due to the newest feature of each successive model.

This simple modeling illustrates just what we can learn from receiver functions alone (nearly alone;
a priori knowledge of the uppermost velocity was helpful). That is, we can identify 3 crustal discontinuities,
one being a velocity inversion, and the Moho, and place them accurately on a depth-velocity curve,
and we are confident of the Poisson ratio in the uppermost crust because of the identification of the crustal
multiples. We can now combine this new knowledge with previous, independently determined constraints
to produce a better image of earth structure.

Because of significant uncertainty in receiver function amplitudes (Baker, et al., 1996), and concomitant
uncertainty in the magnitude of the velocity jumps and layer velocities, as well as our lack of
constraint on smooth velocity gradients between discontinuities and variation of Poisson’s ratio, we do
not expect the thicknesses of the simple model layers to be accurate. We incorporate independent information
regarding average slownesses from the travel time tomography to help to constrain layer
thicknesses (Scott, 1992, Scott, et al., 1994). Near surface velocities are poorly constrained by the tomography
study, due to minimal ray coverage. Resolution of velocities is best between 9 and 16 km
depth,  where  the  P/S  velocity  ratio  averages  1.708  (Vp=6.12,  Vs=3.58)  compared  to  the  surface  value  of 
1.8  we  have  retained  throughout.  We  modify  the  model  by  using  the  average  P/S  velocity  ratio  from  the 
local  travel  time  tomography  study  throughout  the  crust,  except  in  the  topmost  layer  where  we  retain  the 
borehole  value  of  1.8,  and  in  the  narrow  LVZ  from  9  to  12  km  depth.  We  also  adopt  the  absolute  P  and 
S-wave  velocities  found  in  the  tomographic  study  for  the  best  constrained  depth  section  covered  (9  to  16 
km),  again  with  the  exception  of  the  LVZ.  With  these  changes,  the  Moho  depth  of  the  model  drops  to 
nearly 30 km, bringing it in line with results of previous investigations, albeit of a more regional scale
(Hearn and Clayton, 1986; Sung and Jackson, 1992). An 8 km thick LVZ would have been resolved in
the travel-time tomography (Scott, 1992; Scott et al., 1994). For consistency with the tomography, the
LVZ is thinned and given a gradational base (so it produces no large Ps phase). A shear-wave velocity
discontinuity alone can produce the trough at 1.2 seconds to match the data. We show (figure 5, right) the
extreme case of no P wave LVZ. In figure 6 we see that synthetic receiver functions based on this
modified model fit the data as well as the simple one.
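
The sense of the Moho depth change described above can be illustrated with the standard flat-layer
relation t(Ps-P) = H [sqrt(1/Vs^2 - p^2) - sqrt(1/Vp^2 - p^2)]. The sketch below is a single-layer
approximation, not the layered model of figure 5. The uniform crustal P velocity of 6.3 km/s and the ray
parameter of 0.05 s/km are assumed values chosen only for illustration, and the function names are
hypothetical.

```python
import math

def ps_minus_p(thickness_km, vp, vp_vs, p):
    """Ps-P delay (s) for a flat layer of uniform velocity over a half space."""
    vs = vp / vp_vs
    q_s = math.sqrt(1.0 / vs**2 - p**2)   # vertical S slowness (s/km)
    q_p = math.sqrt(1.0 / vp**2 - p**2)   # vertical P slowness (s/km)
    return thickness_km * (q_s - q_p)

def depth_from_delay(delay_s, vp, vp_vs, p):
    """Invert the same relation for layer thickness (km)."""
    return delay_s / ps_minus_p(1.0, vp, vp_vs, p)

# Assumed illustration values (not taken from the layered model in the text).
VP, P = 6.3, 0.05
delay = ps_minus_p(25.2, VP, 1.8, P)             # delay implied by 25.2 km at Vp/Vs = 1.8
deeper = depth_from_delay(delay, VP, 1.708, P)   # same delay reinterpreted at Vp/Vs = 1.708
print(f"delay = {delay:.2f} s, depth at Vp/Vs 1.708 = {deeper:.1f} km")
```

Lowering Vp/Vs reduces the Ps-P delay accumulated per kilometer, so the same observed delay maps to a
deeper interface. In this uniform-layer sketch the ratio change alone accounts for only part of the
deepening; the refined model also changes the absolute velocities.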

Because  of  the  refined  model’s  better  agreement  with  the  results  of  travel  time  tomography,  with 
regard  to  P  and  S  velocities  and  the  thickness  of  the  LVZ,  and  its  better  agreement  with  the  Moho  depth 
of  more  regional  studies,  we  consider  it  a  better  approximation  of  true  earth  structure. 


Complex  Moho  Topography  near  the  San  Jacinto  Mt/Salton  Trough  Transition 

There is significant azimuthal variation of both travel times and amplitudes of the Moho Ps phase
relative to P on the receiver functions computed for PFO. Unfortunately, the trade-offs between
parameters that affect receiver function amplitudes, discussed in the previous section, become even more
complicated as many combinations of parameters varying with azimuth might explain the observations.
In addition, our implicit assumption that observed variations are due to lateral fluctuations of properties
at the interface where the phase is generated may not hold. For example, surface scattered energy from a
particular azimuth could arrive simultaneously with the Moho Ps phase, changing the apparent timing and
amplitude. To reduce these ambiguities, we have also examined the azimuthal variation of event
mislocation vectors and crustal P-wave reverberation travel times.

Figure 2.5: Two velocity models which fit the data equally well based on forward modeling.

Figure 2.6: Synthetic receiver functions for the simple (dotted) and modified (dashed) models compared
to the data (solid). Comparisons are made for the steepest (bottom) and second steepest incidence (top)
group receiver functions of figure 3.

Each type of observation we make samples different portions of the crust and is sensitive to
parameters in different ways. The travel time and amplitude of the Moho Ps phase relative to P are
sensitive to the amount and direction of dip where the direct P and the Ps phases cross the Moho. The
travel time is also sensitive to velocity variations above the Moho. The amplitude is sensitive to
variations in the velocity jump. Which portion of the crust is sampled by the P and Ps phases is strongly
dependent on Moho dip (figure 7). In contrast, mislocation vectors are sensitive to lateral velocity
variations along
the  entire  ray  path,  although  a  steeply  dipping  Moho  would  cause  significant  mislocation.  The  region 
affecting  direct  P-waves  extends  further  from  the  station  than  that  affecting  P-to-S  converted  phases  from 
the  Moho.  The  multiply  reverberating  PpPmp  phase  samples  an  even  greater  lateral  segment  of  the  crust 
(figure  7).  When  we  compare  inferences  we  make  from  each  type  of  observation,  we  are  assuming  that, 
whatever  the  character  of  Moho  variations,  they  will  be  somewhat  consistent  in  their  nature  over  the 
length  scales  we  are  sampling.  Gross  inconsistencies  between  the  inferences  made  from  observations  of 
different  phases  would  suggest  rapid  variations  in  Moho  depth  and  dip. 
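
The different lateral sampling of the three phase types can be sketched for the simplest case of a
flat, uniform crust, where each leg across the layer covers a horizontal distance H tan(i) with
sin(i) = p v. The thickness, velocities and ray parameter below are assumed illustration values, and the
function name is hypothetical; for a dipping Moho the offsets shift as described in the text.

```python
import math

def horizontal_offset(depth_km, velocity, p):
    """Horizontal distance covered by one leg crossing a flat layer."""
    sin_i = p * velocity                      # Snell's law: sin(i) = p * v
    if not 0.0 <= sin_i < 1.0:
        raise ValueError("ray is post-critical for this velocity")
    return depth_km * sin_i / math.sqrt(1.0 - sin_i**2)

# Assumed illustration values: 30 km crust, Vp = 6.3 km/s, Vp/Vs = 1.708, p = 0.06 s/km.
H, VP, VS, P = 30.0, 6.3, 6.3 / 1.708, 0.06

x_ps = horizontal_offset(H, VS, P)   # Ps conversion point on the Moho
x_p = horizontal_offset(H, VP, P)    # point where direct P crosses the Moho
x_pppmp = 3.0 * x_p                  # PpPmp enters the crust three P legs from the station
                                     # and reflects off the Moho one P leg out

print(f"Ps: {x_ps:.1f} km, P: {x_p:.1f} km, PpPmp entry: {x_pppmp:.1f} km")
```

Under these assumptions the Ps conversion point lies closest to the station, the direct P crossing point
lies farther out, and PpPmp samples the Moho over the widest span.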

Figure  8  shows  the  receiver  functions  for  the  individual  back  azimuth  and  ray  parameter  groups 
identified  in  figure  2.  They  are  lowpass  filtered  below  1  hertz  to  demonstrate  that  timing  and  amplitude 
differences  observed  are  not  simply  due  to  varying  frequency  content  of  the  receiver  functions.  The  Moho 
Ps  varies  significantly  in  amplitude  and  time,  from  3.4  to  4.2  seconds  after  P.  The  pattern  of  Moho  Ps-P 
times  measured  for  individual  receiver  functions  (i.e.  calculated  for  individual  events,  not  the  groups)  is 
shown  in  slowness-azimuth  space  in  figure  9  (left).  The  largest  peak  between  2.5  and  4.5  seconds  was 
taken  to  be  the  Moho  Ps  converted  phase.  We  attempt  to  model  it  with  a  dipping  planar  Moho.  The  pattern 
is  matched  well  for  a  32  km  deep  6.3  km/sec  layer  over  a  half  space  dipping  20°  to  the  northwest  (figure 
9,  right).  Varying  the  strike  by  20°  degrades  the  fit  significantly.  To  quantify  better  the  strike  and  dip  that 
provide  the  best  fit  to  the  data,  we  calculated  the  observed-minus-predicted  Moho  Ps-P  times  for  models 
covering  a  range  of  values  for  strike  and  dip  of  the  interface.  To  avoid  biasing  results  by  high  density  data 
clusters  we  average  the  data  over  bins  of  0.0045  sec/km  in  ray  parameter  and  10°  in  backazimuth.  We 
then  calculated  the  standard  deviation  of  the  residuals  for  all  models  with  5°  incremental  variations  in  dip 
angle  (from  0°  to  30°)  and  direction  and  find  a  well-defined  minimum  at  15°  to  20°  dip  to  the  northwest 
(figure 10). The demeaned residual pattern is shown for 20° dip at 225° (figure 11, left), which has the
same rms error as 15° dip at 220°. The point where the residual is plotted represents the horizontal
distance and backazimuth to the point where the ray for the Ps phase intersects the Moho. For comparison,
we plot the demeaned Ps-P residuals relative to a 30 km thick 6.3 km/s flat lying layer over a half space
(figure 11, right). Note that because of refraction at the dipping interface, the Ps rays on the left of
figure 11 intersect the Moho "updip" of where they would intersect a horizontal Moho (right side of
figure 11). The standard deviation of the residuals is 0.25 s for the flat model and 0.20 s for the dipping
model.

[Figure 2.8 panel title: Receiver Functions Grouped by Distance and Backazimuth; horizontal axis: Time
(seconds)]

Figure 2.8: Receiver functions, low-pass filtered below 1 hertz, for the event groups pictured in figure
2, with group numbers of figure 2 above each trace. Average backazimuth and distance for each group
is to the left of the traces. Note the variability of the Moho Ps-P time at about 4 seconds (downward
arrow). Upward arrow indicates negative peak after Moho Ps.

[Figure caption fragment: "... over a half space dipping 20 degrees ..."]

[Figure panels: Ps - P Residuals (Observed - Predicted); left: 20° Northwest Dipping Layer; right:
Horizontal Layer. Caption fragment: "... deep horizontal layer over a half space ... of the Moho are
sampled in each model."]
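
The bin averaging and the search over dip angle and direction described above can be sketched as
follows. This is a minimal illustration, not the code used in this study. In particular, `toy_predict` is
a hypothetical stand-in for a ray-traced dipping-layer Ps-P calculation, which is not reproduced here,
and the synthetic samples exist only to exercise the search.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev

def bin_average(samples, dp=0.0045, dbaz=10.0):
    """Average (p, baz, value) samples within ray-parameter/backazimuth bins."""
    bins = defaultdict(list)
    for p, baz, value in samples:
        key = (math.floor(p / dp), math.floor((baz % 360.0) / dbaz))
        bins[key].append((p, baz, value))
    # Represent each bin by the mean p, mean backazimuth and mean value of its members.
    return [tuple(mean(col) for col in zip(*members)) for members in bins.values()]

def residual_std(binned, predict, dip, direction):
    """Standard deviation of observed-minus-predicted times (pstdev removes the mean)."""
    residuals = [t_obs - predict(p, baz, dip, direction) for p, baz, t_obs in binned]
    return pstdev(residuals)

def grid_search(binned, predict, dips=range(0, 35, 5), directions=range(0, 360, 5)):
    """Return (std, dip, direction) of the model with the smallest residual spread."""
    return min((residual_std(binned, predict, d, a), d, a)
               for d in dips for a in directions)

def toy_predict(p, baz, dip, direction):
    """Hypothetical placeholder predictor: NOT a physical dipping-layer calculation."""
    return 3.8 + 0.02 * dip * math.cos(math.radians(baz - direction))

# Synthetic samples generated from the placeholder itself (dip 20, direction 315).
samples = [(0.05, baz, toy_predict(0.05, baz, 20, 315)) for baz in range(5, 360, 30)]
binned = bin_average(samples)
best = grid_search(binned, toy_predict)
print(best)
```

Averaging within bins before computing the spread keeps dense clusters of events from dominating the
misfit, which is the purpose stated in the text. Replacing `toy_predict` with a real forward calculation
is the part that carries the physics.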

There are indications of deviations from planar structure. Data at steep incidence in the southwest
and northeast quadrants show a juxtaposition of very different Ps-P times over very short distances
(figure 11). Although there are few data from the northeast, there is a small consistent group of events at
shallow incidence, all with large positive residuals (greater Ps-P times than predicted). Events at steeper
incidence and events at 20°-30° more southerly backazimuths have large negative residuals. Another
possible juxtaposition of large and small residual times is seen due south of PFO (in figure 11, left side)
about the 7 km radius circle. These rapid spatial variations of times may indicate a sharp change in depth
and/or dip of the Moho. Comparing the 13 km radius (outer) circle of figure 11 (left side) with that in
figure 1, we see that the apparent discontinuity to the east of PFO is near the surface expression of the San
Jacinto Mountains - Salton Trough boundary (see also figure 17). Although this does not indicate the
position of the batholith contact at depth, the mass above the Moho on either side of this boundary is
likely to vary strongly. The apparently anomalous points to the south of PFO (figure 11) coincide with the
steep 1400 meter north slope of Toro Peak (figure 1). The Ps-P time differences there would be
equivalent to variations of Moho depth of 4 to 5 kilometers for flat-lying models with no lateral P velocity
variations. Further, iterating through the same set of models as above for just the steeply incident events
from the southwest through the northwest, for which there is very dense data coverage, we find that the
minimum Ps-P time residual is for a model dipping nearly due west. Given the density of coverage and
the consistency of the times, we think this indicates a real deviation from planar structure.

The positions at which the residuals for the dipping planar model are plotted vary greatly from those
of the flat lying model (figure 11). This points to a serious problem with our approach thus far. Some of
these rays have been refracted past the vertical, so as to appear to arrive from the opposite backazimuth.
This would lead to negative Ps arrivals, which we do not observe. This suggests that deeper Moho to the
northwest, still the simplest explanation for the observed Ps-P time pattern (figures 8-11), cannot be
accomplished entirely by planar dip of the Moho. We suggest that it is more likely to be accomplished with
step offsets, as was also indicated by the juxtaposition of different residuals, or by some combination of
dip and step offsets. An alternate explanation, lateral velocity variations, seems less plausible. If P and S
velocities varied together as is usual, large changes in both would be necessary to make significant
changes in relative S and P travel times through the crust. Even in the improbable case of lateral
heterogeneities occurring only in P velocities, an average velocity variation of 10 percent over the entire
crust would be required to produce the same Ps-P time variation as a 4 km increase in crustal thickness.
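
The trade-off quoted above can be checked with the flat-layer delay relation
t(Ps-P) = H [sqrt(1/Vs^2 - p^2) - sqrt(1/Vp^2 - p^2)]. The sketch below assumes a 30 km reference crust
with Vp = 6.3 km/s, Vp/Vs = 1.708, and a ray parameter of 0.06 s/km; these are illustration values rather
than a fit to the data, and the perturbation is applied as a 10 percent increase in Vp with Vs held fixed.

```python
import math

def ps_minus_p(thickness_km, vp, vs, p):
    """Ps-P delay (s) for a flat uniform layer over a half space."""
    q_s = math.sqrt(1.0 / vs**2 - p**2)   # vertical S slowness (s/km)
    q_p = math.sqrt(1.0 / vp**2 - p**2)   # vertical P slowness (s/km)
    return thickness_km * (q_s - q_p)

# Assumed reference crust and ray parameter (illustration only).
H, VP, VS, P = 30.0, 6.3, 6.3 / 1.708, 0.06

reference = ps_minus_p(H, VP, VS, P)
dt_thicker = ps_minus_p(H + 4.0, VP, VS, P) - reference     # crust thickened by 4 km
dt_faster_p = ps_minus_p(H, 1.10 * VP, VS, P) - reference   # Vp raised 10 %, Vs unchanged

print(f"4 km thicker: {dt_thicker:+.2f} s; 10 % faster P: {dt_faster_p:+.2f} s")
```

Under these assumptions the two changes give nearly the same increase in Ps-P time, in line with the
statement that a P-only perturbation would have to average about 10 percent over the whole crust.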

Whereas  the  Ps-P  times  depend  on  both  depth  and  dip  of  an  interface,  the  amplitudes  of  P  and  Ps 
phases  are  most  sensitive  to  dip.  The  Ps/P  amplitude  ratio  can  be  a  more  powerful  measure  of  dip  than  the 
amplitudes  are  separately,  as  the  P  and  Ps  amplitudes  vary  oppositely  with  dip  (Owens  et  al.,  1988). 
Because  of  the  trade-off  between  dip  and  size  of  the  velocity  discontinuity  in  predicting  Ps/P  amplitude 
ratios, as well as the sensitivity of receiver function amplitudes to noise and to small changes in dip as
seen  in  synthetic  tests,  we  can  use  amplitudes  only  to  predict  the  general  direction  and  not  the  steepness 
of  the  dip,  and  then  only  where  we  have  dense  data  coverage. 

Figure  12  shows  the  observed  Moho  Ps/P  amplitude  ratios.  As  amplitude  ratios  of  individual  receiver 
functions  vary  significantly,  we  plot  the  amplitude  ratios  of  the  receiver  functions  computed  for  events 
binned  by  ray  parameter  and  backazimuth  and  low-pass  filtered  (shown  in  figure  8).  In  figure  12  we 
compare  the  observed  to  synthetic  ratios  for  a  20°  northwest  dipping  model.  The  overall  pattern  of 
amplitude  variation  is  similar  to  that  of  the  observations.  The  major  deviation  between  predicted  and 
observed  amplitudes  is  for  events  from  the  southeast,  where  we  see  that,  for  20°  dip,  there  should  be 
almost  no  predicted  Ps,  as  the  incident  rays  would  be  nearly  perpendicular  to  the  plane  of  the 
discontinuity.  A  deviation  from  the  northwest  dipping  planar  model  with  a  gentler  slope  in  the  southeast 
than in other quadrants would explain the observations. Further, in contradiction with the above results in
which the Ps-P travel times for the subset of events coming in at steep incidence from the west indicated
a westerly dip, the very distinct increase in the Ps/P amplitude ratio from southwest to northwest indicates
a northwest dip. An explanation of both the travel time and amplitude variations requires more
complicated structure than a planar dipping interface, for example, downward step offsets in the Moho
from the west to the northwest, while maintaining northwest dip in the individual segments. To sum up
thus far, the Moho deviates significantly from planar structure over length scales on the order of 10 km
or less. The specifics of those deviations cannot be resolved with present data. Regardless of the second
order structure, the best fitting plane to the Moho Ps data dips steeply to the northwest. The Ps-P time
difference is equivalent to that between flat-lying models differing by 7 km in thickness.

[Figure caption fragment: "... dip direction. Plus symbols indicate reversed polarity of the Ps arrival."]

The  PpPmp  phase  can  also  be  used  to  constrain  depth  and  dip  of  the  Moho.  To  reduce  the  complexity 
of  the  original  seismograms  due  to  source  and  distant  path  effects,  we  stacked  the  seismograms  of  the 
impulsive  events  within  each  ray  parameter  and  backazimuth  grouping  of  figure  2,  after  aligning  and 
normalizing  the  largest  arrival  of  each  event.  Figure  13  shows  the  individual  vertical  component  records 
(top  five  traces)  for  one  of  the  bins  and  the  stack  of  the  five  records  (bottom  trace).  We  identify  the  only 
remaining  distinct  secondary  arrival  in  the  stack  as  PpPmp.  Similar  processing  of  the  radial  components 
confirms that the arrival has the same polarity on the radial component and so is not likely a Ps converted
phase.  Unfortunately,  the  only  impulsive  events  are  from  the  west.  Events  with  eastern  backazimuth  are 
generally  smaller,  mid-Atlantic  ridge  events  with  emergent  arrivals.  Figure  14  shows  the  stacked  arrivals 
for  each  of  four  bins,  and  their  respective  PpPmp-P  times.  We  see  that  the  differential  time  increases  from 
southwest  to  northwest,  consistent  with  deeper  Moho  to  the  northwest.  The  differences  between  PpPmp-P 
times  at  different  azimuths  cannot  be  modeled  using  a  dipping  layer  alone  regardless  of  the  degree  of  dip 
without  severe  revision  of  the  crustal  velocities  we  have  been  using.  The  reason  for  this  is  illustrated  in 
figure  7.  Although  an  increase  in  dip  increases  the  vertical  distance  the  downdip  rays  must  travel,  the 
horizontal  ray  path  distances  decrease.  The  result  is  less  time  variation  than  we  might  at  first  expect, 
although  the  sense  of  time  variation  is  what  we  expect,  with  longer  PpPmp  travel  times  downdip.  An 
alternate  explanation  of  the  data  is,  again,  to  call  on  Moho  step  offsets  as  a  means  of  thickening  the  crust 
to the northwest. If the Moho were everywhere horizontal, with 6.3 km/sec average crustal velocity, the
PpPmp-P times would indicate crustal thicknesses of 29, 31, 35, and 37 km from southwest to northwest.

[Figure 2.13 panel title: Individual and Stacked P Arrivals]

Figure 2.13: Normalized P waves (vertical recordings) of impulsive events from group 5 of figure 2
(top five traces), aligned by arrival time, and their stack (bottom trace).

Figure 2.14: Stacks for the 4 groups containing impulsive events. PpPmp-P times are listed above
the presumed PpPmp arrival of each trace. The position of each group is indicated, with the circle
scaled by the residual relative to the flat-lying 6.3 km/sec 30 km deep layer over a half space of
figure 11 (so as to account for the effect of variations in ray parameter between groups).
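
The conversion between PpPmp-P time and crustal thickness for a horizontal Moho follows from the two
extra P legs across the crust, t(PpPmp-P) = 2 H sqrt(1/Vp^2 - p^2). The sketch below uses the 6.3 km/s
average crustal velocity quoted in the text; the ray parameter of 0.06 s/km is an assumed value (each
group has its own), and the function names are hypothetical. The measured times of figure 14 are not
reproduced here.

```python
import math

def pppmp_minus_p(thickness_km, vp, p):
    """PpPmp-P delay (s): two extra P legs across a flat, uniform crust."""
    return 2.0 * thickness_km * math.sqrt(1.0 / vp**2 - p**2)

def thickness_from_pppmp(delay_s, vp, p):
    """Invert the same relation for crustal thickness (km)."""
    return delay_s / (2.0 * math.sqrt(1.0 / vp**2 - p**2))

VP = 6.3    # km/s, average crustal P velocity quoted in the text
P = 0.06    # s/km, assumed ray parameter for illustration

for h in (29.0, 31.0, 35.0, 37.0):
    dt = pppmp_minus_p(h, VP, P)
    print(f"{h:.0f} km -> PpPmp-P = {dt:.2f} s")
```

Because the delay scales linearly with thickness for a fixed ray parameter, differences in ray parameter
between groups must be accounted for before times from different groups are compared.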

Mislocation vectors are the difference between the predicted and observed directions of arrival at a
station, plotted on a focal sphere (e.g. Davies, 1973). The tangential components of the mislocation
vectors are shown in figure 15. We used the data variance tensor decomposition (Aster et al., 1990) to
measure the 3-dimensional particle motion. A horizontal Moho would cause no mislocation, but dipping
structure will bend the rays so they come in from further updip than predicted. To avoid biasing the results
we used an objective criterion based on the linearity of particle motion to determine whether to include a
measurement and made no modifications (e.g. to the data window used) once the mislocation was
calculated. We observe a consistent pattern of mislocation vectors. These observations are matched by a
layer over a half space model dipping to the southwest (figure 15, bottom), perpendicular to the overall
direction of Moho dip we inferred from interpretation of the receiver functions. The radial components of
the mislocation vectors are much noisier, but are consistent with the southwest dipping model. The
receiver function Moho Ps, the PpPmp, and the mislocation vector observations are each internally
consistent. All observations indicate deviations from planar Moho topography. Using a planar dipping
interface model, the mislocation vectors suggest dip of some layer to the southwest, whereas the others
suggest deeper Moho to the northwest. The pattern we observe in the mislocation vectors could be due to
bending of the ray paths anywhere between the sources and receiver. The radial receiver function
waveforms indicated no significant planar dipping crustal interfaces, although there is poor resolution of
dip at the uppermost interface. A study of array mislocations (Burdick and Powell, 1980; Walck, 1984)
indicates that across southern California there is no average regional mislocation observed, suggesting
that the pattern at PFO is not global, but must have been produced locally. Walck explains mislocations
she observes in the area near PFO as being due to a combination of both crustal and deeper (50-200 km)
variations in velocity structure. She infers the existence of an east-west trending antiform with axial depth
of 100 km, 70 km north of PFO. Rays from the northwest, but not the southwest, reaching stations at the
latitude of PFO are affected by that antiform. She suggests that rays from the southeast are affected by
low velocities below the Salton Trough. We cannot distinguish as she does between the shallow and deep
effects, but it is reasonable to suggest that much of the pattern of mislocation vectors observed at PFO is
a result of some combination of such structures. The pressing question is not what causes the pattern
observed at PFO, but why there is no signature in the mislocation vectors at PFO from a northwest
dipping Moho. For such a structure, the greatest tangential mislocation should be observed in the
northeast (where we have almost no data) and southwest quadrants. The signal in the northwest and
southeast quadrants should be small, and so could be overwhelmed by the signal from the antiform to the
northwest and low velocities in the rift to the southeast. The data in the southwest quadrant, however, are
inconsistent with the mislocations expected from a planar northwest dipping Moho, and so imply
non-planar Moho topography.

[Figure 2.15 panel titles: Tangential Component of Mislocation Vectors at PFO (top); Synthetic
Tangential Mislocation Vectors for S40W Dipping Layer (bottom)]

Figure 2.15: Tangential component of mislocation vectors for initial P waves at PFO (top) and for a
layer over a half space dipping 20 degrees towards S40W (bottom). Circles are at 10°, 25°, and 40°
incidence. Predicted positions of the events are plotted as disks. Arrows point to the measured positions
of the events. Eighteen of twenty-one mislocation vectors in the northwest quadrant are pointing
clockwise, with a mean of nearly 5 degrees; the southeast quadrant has a mean tangential mislocation
of 2.5 degrees counterclockwise with sixteen of twenty-one events consistent in direction, while the
southwest quadrant mislocation vectors are much less consistent and have a mean of 1.5 degrees
clockwise.
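
A tangential mislocation measurement of the kind discussed above can be sketched with a principal-axis
estimate of horizontal particle motion. This is a simplified two-component stand-in for the
three-component variance tensor decomposition of Aster et al. (1990), not a reproduction of it. The
function names, the synthetic pulse, and the 0.9 linearity threshold are all hypothetical.

```python
import math

def horizontal_polarization(north, east):
    """Principal-axis azimuth (deg from north, modulo 180) and linearity of particle motion."""
    n = len(north)
    mean_n, mean_e = sum(north) / n, sum(east) / n
    c_nn = sum((x - mean_n) ** 2 for x in north) / n
    c_ee = sum((y - mean_e) ** 2 for y in east) / n
    c_ne = sum((x - mean_n) * (y - mean_e) for x, y in zip(north, east)) / n
    # Orientation of the major axis of the 2x2 covariance matrix.
    azimuth = (0.5 * math.degrees(math.atan2(2.0 * c_ne, c_nn - c_ee))) % 180.0
    # Eigenvalues of the covariance matrix; linearity is 1 for purely rectilinear motion.
    half_trace = 0.5 * (c_nn + c_ee)
    root = math.hypot(0.5 * (c_nn - c_ee), c_ne)
    lam1, lam2 = half_trace + root, half_trace - root
    linearity = 1.0 - lam2 / lam1 if lam1 > 0.0 else 0.0
    return azimuth, linearity

def tangential_mislocation(azimuth, predicted_baz):
    """Signed angle (deg, clockwise positive) from predicted to observed direction.

    The 180-degree ambiguity of the principal axis is resolved by taking the
    candidate closest to the predicted backazimuth.
    """
    return (azimuth - predicted_baz + 90.0) % 180.0 - 90.0

# Synthetic rectilinear pulse polarized along 230 degrees; predicted backazimuth 225 degrees.
true_dir = math.radians(230.0)
pulse = [math.sin(math.pi * k / 20.0) for k in range(21)]
north = [a * math.cos(true_dir) for a in pulse]
east = [a * math.sin(true_dir) for a in pulse]

azimuth, linearity = horizontal_polarization(north, east)
# Accept the measurement only if the motion is sufficiently linear (assumed threshold).
mislocation = tangential_mislocation(azimuth, 225.0) if linearity > 0.9 else None
print(azimuth, linearity, mislocation)
```

Applying the linearity test before computing the mislocation mirrors the objective selection criterion
described in the text: the decision to keep a measurement does not depend on its value.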

There is a negative arrival immediately following the Moho Ps arrival, peaking at 3.8 to 4.4 seconds
after P (figure 8). For some of the receiver functions this phase is matched by the PpPjS from the LVZ of
the synthetics (figures 4 and 6). However, it also tracks the azimuthal variation of the Moho Ps phase
(figure 8). Its synchronization with the Moho Ps phase suggests that the arrival may be the Ps phase from
the base of a high-velocity lid at the top of the mantle. Both phases could be interfering there, making
interpretation problematic. Beyond that arrival, consistency between receiver functions decreases,
probably because of decreasing signal amplitude for the later arrivals, interference between later arrivals
and reverberations of early arrivals, greater sensitivity of later arrivals to dip and laterally varying
structure, and the effect of complicated Moho topography on later crustal reverberations.

Tangential  Component  Receiver  Functions  and  Dipping  Structure 

Arrivals on the tangential components of receiver functions represent energy coming in off the
predicted backazimuth. We can attempt to model them deterministically only if, for some azimuths, they
have a counterpart on the radial component receiver functions and if there is continuity between receiver
functions from similar backazimuths and ray parameters. The extent of variation in Moho topography
inferred from the radial components encourages us to expect a Moho Ps signal on the tangential
components at PFO. Somewhat surprisingly, only a couple of the groups' tangential receiver functions have
arrivals at the same time as the Moho Ps on the radials, and even in those cases, there is no consistency
between adjacent groups. This does not imply that there is no Moho dip, but simply reflects the difficulty
in interpreting this component. The lack of a strong Moho signal on the tangential components is
consistent with the Moho depth change being accomplished by step offsets rather than by a smooth surface.

There are early tangential component arrivals with sufficient continuity between adjacent groups to
warrant further scrutiny. Those arrivals appear to correspond to the converted and reflected phases from
the uppermost discontinuity (figures 4 and 5). We note however that interpretation of the radial receiver
functions becomes much more ambiguous if we consider models with dipping interfaces. In these cases,
the receiver function arrivals no longer are restricted to representing phases that arrive at the seismometer
as S-waves. In fact, the largest receiver function arrival after P for a simple dipping layer over a half space
can be the first P multiple, PpPip. Which phase is largest depends on the angle between the ray azimuth
and the interface strike. We concentrate our attention on the tangential component receiver function of
group 5 of figure 2, both because it has the most events and so has very high signal-to-noise ratio, and
because the tangential mislocation vectors at that distance and backazimuth are nearly zero (figure 15).
Receiver function arrivals from dipping interfaces are very sensitive to the angle through which the
seismograms are rotated. Rotation errors of the size observed could cause significant tangential
component arrivals to disappear, or even change polarity. Thus the mislocation vectors we discussed
earlier make interpretation of the other groups' arrivals more ambiguous. Such a mislocation study should
always be performed if tangential components are going to be modeled, to ensure that arrivals due to
energy refracted out of the sagittal plane below the depth of interest are not misinterpreted in terms of
shallower dipping structure. That being done, the sensitivity to rotation can actually be used to infer both
the existence and direction of dip.
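
The sensitivity of tangential arrivals to rotation error can be illustrated with a plane rotation of the
horizontal components. In the sketch below the amplitudes are hypothetical (a unit radial arrival with a
small tangential arrival of 0.08), the 5° error is chosen to be comparable to the mean mislocations
reported in figure 15, and the sign convention of the rotation is an assumption.

```python
import math

def misrotate(radial, tangential, error_deg):
    """Re-express radial/tangential traces in a frame rotated by error_deg."""
    c = math.cos(math.radians(error_deg))
    s = math.sin(math.radians(error_deg))
    new_r = [c * r + s * t for r, t in zip(radial, tangential)]
    new_t = [-s * r + c * t for r, t in zip(radial, tangential)]
    return new_r, new_t

# Hypothetical single-sample arrival: large radial pulse, small positive tangential pulse.
radial, tangential = [1.0], [0.08]

_, t_plus = misrotate(radial, tangential, 5.0)     # 5 degree error in one sense
_, t_minus = misrotate(radial, tangential, -5.0)   # 5 degree error in the other sense

print(f"true T = {tangential[0]:+.3f}, misrotated T = {t_plus[0]:+.3f} / {t_minus[0]:+.3f}")
```

Because the radial arrival leaks into the tangential component in proportion to the sine of the rotation
error, an error of a few degrees is enough to cancel or reverse a small tangential arrival in one sense and
to roughly double it in the other.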

The radial and tangential receiver functions of group 5 (of figure 2), and their point-by-point product,
indicating the correlation at each time point, are shown in figure 16. For this group, there is good
correlation for the earliest crustal phases as well as the Moho Ps, but only the early arrivals appear
consistently in nearby groups, and we will concentrate on those phases. The dashed curves are the
corresponding synthetic traces for the preferred model (figure 5), in which the uppermost discontinuity is
permitted to dip southward 15° (striking 100°). That direction was chosen by comparison with synthetics
of the polarities and sensitivities of initial arrival and the PpPjS phase to misrotation.

Figure 2.16: Radial (solid line, upper trace) and tangential (solid line, second trace) receiver functions
of group 5 (from figure 2), and their point-by-point product, indicating their coherence (solid line,
lower trace). Synthetic receiver functions and their coherence (dashed lines) are shown below the data
traces, for the refined model of figure 5 in which the shallowest discontinuity dips 15° southward.
Vertical lines through the traces delineate the windows for which particle motion is plotted below
(the upper row is for the data and the lower row is for the synthetics).
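
The point-by-point product used as a correlation indicator above is simple to compute. The sketch below
uses short hypothetical traces and a hypothetical helper for summing the product inside a time window;
neither is taken from the data of figure 16.

```python
def pointwise_product(radial, tangential):
    """Sample-by-sample product of two traces on the same time axis."""
    if len(radial) != len(tangential):
        raise ValueError("traces must share the same time sampling")
    return [r * t for r, t in zip(radial, tangential)]

def window_sum(trace, dt, t_start, t_end):
    """Sum of the samples whose times (k * dt) fall inside [t_start, t_end]."""
    return sum(v for k, v in enumerate(trace) if t_start <= k * dt <= t_end)

# Hypothetical traces sampled every 0.5 s.
radial = [0.0, 0.5, 0.2, -0.3, 0.0]
tangential = [0.0, 0.2, 0.1, 0.1, 0.0]

product = pointwise_product(radial, tangential)
early = window_sum(product, 0.5, 0.25, 1.25)   # window covering the samples at 0.5 and 1.0 s
print(product, early)
```

A positive product marks samples where the two components share polarity and a negative product marks
opposite polarity, so the sign and size of the product within a window give a quick measure of how
coherently an arrival appears on both components.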

Another  piece  of  information  we  can  utilize  is  the  polarization  direction  of  receiver  function  arrivals 
(Langston,  1989).  The  columns  through  the  traces  in  the  upper  part  of  figure  16  delineate  the  windows  for 
which  particle  motions  are  shown  in  the  lower  part  of  the  figure.  Note  that  for  the  data,  the  polarization 
directions  of  the  initial  arrival  and  the  Moho  Ps  (columns  1  and  5)  are  both  somewhat  out  of  the  radial 
direction  expected  for  horizontal  layers,  but  for  PpPjS  of  the  first  interface  (column  3),  the  polarization 
direction  is  pointing  roughly  60°  away  from  the  radial  direction,  while  the  other  crustal  arrivals  are  mixed 
(less  linear  and  possibly  oriented  between  those  extremes).  The  absolute  values  of  polarization  directions 
are  not  robust  measurements,  subject  as  they  are  to  the  uncertainties  in  receiver  function  amplitudes 
(Baker,  et  al.,  1996),  and  to  the  additional  uncertainty  in  interface  strike  and  dip  angles  and  incoming  ray 
backazimuth  and  incidence.  The  relative  values  may  however  carry  useful  information.  The  initial  P  and 
Moho  Ps  have  roughly  similar  polarization  directions,  and  the  first  PpPjS  phase  polarization  direction  is 
much  further  from  the  radial  direction.  That  fundamental  difference  between  the  phases  is  reproduced 
well  by  the  synthetics  (lower  set  of  particle  motion  plots,  figure  16). 

While predictions for the simple case fit the data for some other groups, there are significant
deviations as well. The data do not distinguish whether the deviations are due to greater complexity in the
structure (e.g. variation of dip with azimuth or strong discontinuous scatterers in the crust), or to
misrotation of the seismograms due to other deeper dipping structures. Overlap of the much larger initial
arrival with the first Ps phase prevents us from using azimuthal dependence of the timing on the radial
component as was done for the Moho Ps, so we are limited in what we can confidently interpret in
tangential components of these data.

To summarize our investigation of the tangential component receiver functions, the available
interpretable evidence indicates that the 3.4 km deep interface dips southward, although that conclusion is
admittedly nonunique. We also conclude that consideration of mislocation vectors is important for
avoiding misinterpretation of tangential receiver functions. Their use will enhance the credibility of
interpretations made by controlling for misrotation of the horizontal component seismograms, a major
source of error in tangential arrivals.

Tectonic  Implications: 

Crustal  Low-Velocity  Zone,  Rheology  or  Lithology? 

We  next  consider  the  contributions  of  the  above  analysis  to  our  understanding  of  the  tectonics  of  the 
study  area. 

The  shear-wave  LVZ  has  a  sharp  top  at  9  km  depth.  The  lack  of  a  clearly  identifiable  arrival  from  the 
base  of  the  LVZ  leaves  its  thickness  unconstrained,  and  suggests  that  its  base  may  be  gradational.  We 
model  it  as  being  quite  thin,  as  it  was  not  resolved  in  local  travel  time  tomography  (Scott,  1992).  A 
possible mechanism for a zone of shear-wave low-velocity is a layer of saline fluid, like that inferred by
Park  et  al.  (1992).  We  consider  carefully  the  mechanism  that  would  provide  an  impermeable  boundary. 
Park  et  al.  attribute  it  to  the  brittle-ductile  transition,  estimated  by  Doser  and  Kanamori  (1986)  to  be  at 
11-12  km  depth  in  the  eastern  Peninsular  Ranges,  because  80%  of  the  seismicity  occurs  above  that  depth. 
The  seismicity  deepens  and  heat  flow  decreases  in  the  northern  Peninsular  Ranges  (Doser  and  Kanamori, 
1986),  leading  to  estimates  of  a  deeper  transition  zone.  Doser  and  Kanamori  (1986)  use  a  rheological 
model  with  granite  to  14  km  depth,  and  diabase  beneath,  for  the  area  around  PFO,  and  while  they  do  not 
make  an  exact  estimate  of  depth  to  the  brittle-ductile  transition,  they  place  it  in  the  diabase.  However, 
Bailey (1990) cautions that the arguments for trapping of saline fluid at the brittle-ductile transition are
"intended to apply to stable continental crust, not to tectonically active regions". A requirement for
trapping fluids is that the least principal stress be vertical, so as to permit horizontal fracturing. The T axes of
earthquake mechanisms throughout the northern Peninsular Ranges and eastern Transverse Ranges are,
however, horizontal (Seeber and Armbruster, 1995), suggesting that any fluids migrating upward would
not stop at the transition, but only at some other boundary. Beneath PFO, that boundary could be the 10
to 12 km deep base of the granitic portion of the batholith (Jachens et al., 1991). Given the uncertainties
in depth estimations from both receiver functions and aeromagnetic data, the LVZ and base of the
batholith could coincide. From the measured surface heat flow in the vicinity of PFO of ~60 mW/m²
(Lachenbruch,  1985),  granite  should  remain  brittle  well  below  14  km  depth  (Doser  and  Kanamori,  1986), 
and  be  impermeable  to  fluids  (Bailey,  1990).  From  the  above  discussion,  we  find  saline  fluids  trapped  at 
the  base  of  the  granitic  portion  of  the  batholith  to  be  the  most  likely  tectonic  explanation  of  the  inferred 
crustal  LVZ.  The  interpretation  is  however  nonunique. 

An  alternate  explanation  is  supplied  by  Min  and  Wu  (1987),  who  demonstrate  that  granitic  rock  in  a 
region of high heat flow will develop a pronounced LVZ. This provided a mechanism for a low Vp/Vs
ratio observed for reflection-refraction data from the Tibetan Plateau, where the heat flow measured in two
lakes was 91 mW/m² and 146 mW/m² (Francheteau et al., 1984). High thermal expansion of quartz
relative to neighboring grain boundaries is hypothesized to cause microcracking at pressures corresponding
to depths from 10 to 20 km, increasing until the temperature reaches 650°C when hardening
associated with a phase change to beta-structure occurs (Kern, 1982). If this mechanism were responsible
for  the  LVZ  at  PFO,  we  would  expect  a  LVZ  up  to  10  km  thick,  with  P  velocities  dropping  even  more 
than  S  velocities.  As  we  discussed  in  the  crustal  modeling  section,  we  have  no  constraints  from  receiver 
functions  to  distinguish  the  P  velocity  contrast.  Scott  (1992)  however,  in  her  travel  time  tomography 
study  of  the  area,  resolved  no  changes  in  Poisson  ratio  throughout  the  depth  range  that  we  have  modeled 
as  having  the  LVZ,  and  resolved  no  LVZ  in  P  or  S  velocities.  Thus,  the  base  of  the  granitic  portion  of  the 
batholith  would  have  to  be  reached  at  least  within  a  couple  of  kilometers  depth  beneath  the  top  of  the 
LVZ,  for  this  mechanism  to  be  invoked  and  produce  a  LVZ  sufficiently  thin  to  remain  unresolved  in  the 
travel  time  tomography.  While  not  impossible,  that  seems  to  require  more  fortuity  than  the  first 
explanation.  The  low  heat  flow  measured  at  the  surface  also  would  seem  to  argue  against  this  mechanism, 
but, as pointed out by Park et al. (1992), the surface heat flow of 60 mW/m² reflects temperatures at 10
km  3  m.y.  ago.  The  opening  of  the  Salton  Trough  in  this  region  between  4  and  5  m.y.  ago  (Lachenbruch 
et  al.  1985)  has  possibly  influenced  temperatures  in  the  deep  crust  and  mid-crust,  although  those  changes 
would  not  yet  be  apparent  at  the  surface. 

Three-component  recordings  of  teleseismic  data  in  the  same  area  as  the  magnetotelluric  traverse 
would  be  helpful  by  permitting  determination  of  whether  a  low  resistivity  zone  and  a  seismic  LVZ  are 
coincident  there.  A  seismic  reflection  profile  to  determine  whether  there  is  a  bright  reflector  at  the  same 
depth  would  also  be  useful  independent  information,  particularly  as  there  is  some  correlation  between 
low  resistivity  and  crustal  reflectors  (e.g.  Hyndman  and  Shearer,  1989). 

Moho  Topography 

Our  results  relate  to  Moho  topography  on  two  length  scales,  the  larger  scale  being  the  average  over 
15  to  20  km  diameter  beneath  PFO  of  thicker  crust  northwest  of  PFO  (figures  9  and  10)  and  the  smaller 
scale  being  the  inference  of  irregularity  of  the  Moho  with  significant  depth  variation  over  just  kilometers 
of  horizontal  distance  (figure  17,  inferred  from  figures  11-15).  The  larger  scale  result  -  the  Moho  model 
dipping 15° to 20° to the northwest that best fits receiver function Ps-P times (figure 11, left side) and is
corroborated  by  Ps/P  amplitude  variations  (figure  12)  and  PpPmp-P  times  (figure  14)  -  is  consistent  with 
the  dominant  topography  being  in  isostatic  balance,  that  is,  with  San  Jacinto  Peak  having  a  thick  crustal 
root. 

The non-planar Moho topography and very steep Moho offsets over quite short horizontal length
scales are more difficult to interpret, but potentially more interesting. The gravity data do not resolve
whether Airy isostasy can be extended to those length scales. Nonetheless, the agreement of Airy isostasy
and our larger scale observations, and their disagreement with elastic support, suggest an extension of
Airy isostasy to the shorter scale observations. That is, the fracturing of the crust and subsequent uplift of
the mountains due to unleashed buoyancy forces could account for topography on a smaller horizontal
scale than San Jacinto Peak. Support for this speculation is given in figure 17. There we have plotted the
topography along north-south (left) and nearly east-west (right) cross-sections (the dashed line 12° south
of east in the left side of figure 11). Beneath the topographic profiles are the predicted Moho depth
profiles, based on Airy isostasy with compensation at Moho depth. The horizontal bars represent estimates
of Moho depth along the profiles. Most estimates are based on the average Ps-P times of well over 10
events. The estimates are based on the depth of flat lying Moho. Estimates based on dipping layers would
give similar relative results, but given the uncertainty in absolute measurements we chose to use the
simplest model possible for depth estimates. Positions are based on the intersection of the Ps phase with
the northwest dipping model of the left side of figure 11. While this is the best estimate available to us,
based on minimization of the rms error in Ps-P times, it is clearly subject to large errors, and so correlating
fine details of the depth estimates and predicted topography would be speculative. To first order however,
the receiver function depth estimates match the predicted Moho depth variations well. The only exception
is the bar at 31 km depth, plotted with a dashed line, at 10 km east of PFO. That estimate involved just 5
records, but they were all consistent. The bar at 11 to 12 km east of PFO at 25 km depth involved 4
records. Those two estimates were made from clusters of events centered approximately 4 km south and
north respectively of the cross-section (the large pluses and large circles north and south of the
cross-section line of the left side of figure 11). All other estimates were made using events whose Ps phases
crossed the Moho within approximately a kilometer of one of the cross-sections. We originally identified
the Moho Ps as the largest phase between 2.5 and 4.5 seconds after P, and interpret the cluster of
anomalously large Ps-P times east of PFO as due to misidentification of a later phase as a Moho Ps. From the
anomalous data then, which sample the Moho closer to the Salton Trough than any other data, we infer
that the Moho becomes more gradational there due to higher temperature, and so produces a much smaller
converted phase.

Figure 2.17: North-south (upper left) and nearly east-west (upper right) topographic cross-sections
through PFO. The position of the second cross-section is indicated by a dashed line through the
center of the circle in the left side of figure 11. PFO's position is indicated by the triangle. The
lower plots indicate Moho depth for each cross-section inferred by assuming Airy isostasy with
the depth of compensation at the Moho. The short horizontal bars are estimates of Moho depth
based on receiver functions.
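
The Airy prediction used for the lower panels of figure 17 can be illustrated with a minimal sketch. In Airy isostasy, topography of height h is balanced by a crustal root of thickness h·ρc/(ρm − ρc). The densities (2700 and 3300 kg/m³), the 28 km reference Moho depth, and the function name below are illustrative assumptions, not values taken from this study.

```python
def airy_moho_depth(elevation_km, reference_depth_km,
                    rho_crust=2700.0, rho_mantle=3300.0):
    """Moho depth below sea level for topography compensated by a crustal root.

    reference_depth_km is the Moho depth beneath zero elevation (assumed value).
    Densities are in kg/m^3 and are illustrative assumptions.
    """
    # Root thickness that balances the load of the topography (Airy model).
    root_km = elevation_km * rho_crust / (rho_mantle - rho_crust)
    return reference_depth_km + root_km


# Hypothetical elevations (km) along a profile and the predicted Moho depths.
profile_km = [0.5, 1.0, 1.5, 2.0]
moho_km = [airy_moho_depth(h, 28.0) for h in profile_km]
```

With these assumed densities each kilometer of topography adds 4.5 km of root, so kilometer-scale relief translates into several kilometers of predicted Moho depth variation.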

We also note that other abrupt Moho offsets have been observed in the Pyrenees, where a 15-20 km vertical
offset has been inferred from refraction data (Daignieres et al., 1989), and in the Catalina Mountains,
Arizona, where 4 km of Moho depth variation has been inferred from receiver function modeling (Myers and
Beck, 1994).

A  final  feature  noted  in  the  receiver  functions  is  the  large  negative  phase  that  follows  the  Moho  Ps. 
Its interpretation as the Ps phase from an upper mantle low-velocity zone is consistent with results of the
PACE  refraction  line  (Benz  and  McCarthy,  1994).  As  this  phase  tracks  the  Moho  Ps  phase  regardless  of 
the  Moho  phase’s  timing,  the  above  interpretation  would  require  that  the  high-velocity  upper  mantle  lid 
be  of  fairly  uniform  thickness  and  be  offset  with  the  crust.  This  does  not  seriously  affect  the  assumption 
of  compensation  depth  at  the  Moho,  as  density  variations  between  the  upper  mantle  lid  and  LVZ  are 
thought  to  be  very  small.  They  are  less  than  one  percent  in  PREM. 


Conclusions 

Receiver function modeling has yielded a one-dimensional velocity model beneath PFO which includes
a high contrast shear-wave velocity inversion at 9 km depth (figure 6). Travel time tomography
results  have  proven  to  be  a  useful  complement  to  receiver  function  studies.  Where  receiver  functions  tell 
us  of  the  existence  of  discontinuities  and  place  them  on  a  velocity-depth  curve,  tomography  tends  to 
smear  out  discontinuities  but  provides  us  with  average  slownesses  which  allow  us  to  determine  the  depths 
of  discontinuities  with  greater  confidence.  Inclusion  of  results  from  travel  time  tomography  near  PFO 
(Scott,  1992,  Scott  et  al.,  1994)  has  provided  useful  constraints  in  this  study  on  both  P  and  S  velocities. 
Further,  tomography  results  have  constrained  the  LVZ  imaged  in  receiver  functions,  but  apparently 
smoothed  over  in  the  tomography,  to  be  a  very  thin  layer. 
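
The dependence of inferred depth on independently known slowness can be sketched for the simplest case of a flat interface beneath a uniform layer, where the Ps-P delay is H·(sqrt(1/Vs² − p²) − sqrt(1/Vp² − p²)). The velocities, ray parameter, and delay below are illustrative assumptions chosen to be crust-like, not values from the model of this study.

```python
import math


def vertical_slowness_difference(vp, vs, p):
    """Difference of S and P vertical slownesses (s/km) for ray parameter p (s/km)."""
    return math.sqrt(vs ** -2 - p ** 2) - math.sqrt(vp ** -2 - p ** 2)


def ps_minus_p_delay(depth_km, vp, vs, p):
    """Ps-P delay (s) for a flat interface at depth_km under a uniform layer."""
    return depth_km * vertical_slowness_difference(vp, vs, p)


def depth_from_delay(delay_s, vp, vs, p):
    """Interface depth (km) implied by an observed Ps-P delay."""
    return delay_s / vertical_slowness_difference(vp, vs, p)


# Assumed average crustal velocities (km/s), ray parameter (s/km), and delay (s).
depth = depth_from_delay(3.5, vp=6.3, vs=3.6, p=0.06)
# The same delay with velocities 5% faster maps to a deeper interface.
depth_fast = depth_from_delay(3.5, vp=6.3 * 1.05, vs=3.6 * 1.05, p=0.06)
```

The two depths differ by more than a kilometer for the same delay, which is why average slownesses from tomography tighten the depth estimates.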

The overall tendency toward thicker crust to the northwest of PFO, and the non-planar Moho topography
and rapid spatial variation of Moho depth inferred from the spatial variations of receiver function
Ps-P times, Ps/P amplitudes, and PpPmp-P times, are consistent with a model of Airy isostasy with depth
of compensation at the Moho, operating over length scales of only kilometers. This agrees with gravity
data, to the extent of their resolution. The extension of Airy isostasy compensated at Moho depth to such a
fine scale suggests that fracturing of the crust at the time of uplift of the San Jacinto Mountains was not
limited to the major transform faults. Rather, the crust was fractured everywhere, and the current
topography is due to the extreme apparent weakness of the crust at that time.

Chapter 3


Uncertainty  of  Receiver  Function  Waveforms  and  Implications  for  Modeling 


Abstract 

This  paper  provides  a  thorough  examination  of  the  uncertainty  and  bias  in  the  computation  and 
interpretation  of  receiver  functions.  Real  data  are  used  to  quantify  uncertainty  in  receiver  function 
waveforms, and the resulting limitations in their interpretation are explored. A set of realistic synthetic
seismograms is used to investigate the effects of additive noise and regularization on receiver
function  amplitudes.  The  synthetics  are  computed  by  raytracing  through  a  simple  earth  model, 
convolution  of  the  resulting  seismogram  with  different  observed  P-wave  signals,  and  the  addition  of 
different  samples  of  real  vertical  and  horizontal  seismic  noise. 

We  make  a  bootstrap  estimate  of  the  standard  deviation  of  receiver  functions  for  a  high  quality  real 
data  set  and  map  this  uncertainty  into  the  velocity  contrasts  determined  at  model  interfaces.  Bias  in  the 
regularized  deconvolution  of  noisy  data  is  found  to  be  comparable  to,  or  greater  than,  the  uncertainty 
indicated  by  the  standard  deviations  of  the  mean  receiver  function. 

We  examine  the  effectiveness  of  averaging  functions  applied  to  receiver  functions  to  correct  the 
amplitudes  of  the  initial  peak.  We  find  that  normalization  by  averaging  functions  can  restore  the  initial 
P  amplitudes  of  individual  synthetics  even  for  high  noise  levels.  The  initial  P  amplitudes  may  be 
underestimated  in  stacks  of  many  receiver  functions  if  the  normalization  is  performed  using  the 
maximum  rather  than  zero  lag  amplitude  of  the  receiver  functions.  The  averaging  functions  greatly 
underestimate  the  extent  of  waveform  distortion  introduced  by  regularized  deconvolution. 

We use the velocity spectrum stacking technique, applied to data recorded from a broad range of
ray parameters at a shield site, to investigate the trade-off between depth and velocity for models
derived from receiver function studies.

Finally, we consider the effects of inadequacy in the physical assumptions commonly made for
receiver function interpretation. We show how inaccuracy in the assumed backazimuth and incidence
of the signal can lead to significant errors in the interpretation of absolute, but not relative, receiver
function amplitudes.

We  conclude  that: 

1)  Arrival  times  of  identifiable  receiver  function  phases  are  robust  for  what  we  consider  the  usable 
range of signal-to-noise ratio levels and are the primary source of information derived from receiver
functions.  They  indicate  the  presence  of  discontinuities  and  their  depths,  given  reasonable  independent 
estimates  of  slowness. 

2)  Receiver  function  amplitudes  provide  poor  constraints  on  the  absolute  magnitudes  of  the  velocity 
contrasts  and  the  dips  of  interfaces.  Relative  receiver  function  amplitudes  can,  however,  be  used  to 
infer  relative  velocity  contrasts  at  different  interfaces  and  approximate  orientations  of  dipping 
interfaces. 

Introduction 

Receiver function analysis has been popularized more recently than refraction and reflection
profiling or tomography, and is not as widely accepted. Skepticism of the results from receiver function
studies may be due to incomplete understanding of the strengths and limitations of the technique, which
occasionally results in optimistic over-interpretation of receiver functions. This paper is an attempt to
improve the credibility of receiver functions by carefully discussing the limitations of their
interpretation and outlining those aspects of Earth structure that can most effectively be modeled
using this technique.

Receiver functions represent P-to-S converted phases generated at interfaces beneath seismic
stations, isolated from other complexities in teleseismic waveforms by deconvolution of the vertical
component of a seismogram from its horizontal components (Burdick and Langston, 1977; Vinnik,
1977; Langston, 1979). This technique has been used to infer crustal structure (e.g. Langston, 1979;
Owens, 1984), subducting slab geometry (e.g. Cassidy, 1993), and upper mantle discontinuities (e.g.
Gurrola et al., 1995; Vinnik, 1977; Kind and Vinnik, 1983; Bostock, 1996). The utility of receiver
function waveforms in constraining Earth structure has been addressed largely via noise-free synthetic
studies (Ammon et al., 1990; Cassidy, 1992). We extend that work by considering the uncertainty in
receiver function waveforms, using real data, and by using realistic noisy synthetic seismograms to
examine some causes of that uncertainty.

Ammon et al. (1990) addressed resolution and uniqueness of one-dimensional velocity structures
using  noise-free  synthetic  data  and  automatic  waveform  inversions,  and  concluded  that  large,  sharp 
velocity  discontinuities  can  be  identified  and  placed  on  a  depth-velocity  curve,  while  gradational 
velocity variations (with ~10% change) are not well resolved. They also demonstrated by varying layer
thicknesses,  P-wave  velocities  (Vp),  S-wave  velocities  (Vs),  and  densities,  that  a  wide  range  of  models 
can  produce  identical  receiver  functions.  Cassidy  (1992)  examined  the  importance  of  absolute 
amplitudes  to  the  inference  of  dip  angle  of  an  interface  and  discussed  a  potential  pitfall  of  modeling 
Ps/P  amplitude  ratios.  Ammon  (1991)  introduced  a  technique  to  correct  receiver  function  amplitudes 
by  normalizing  by  averaging  functions  and  discussed  its  importance  in  constraining  near  surface 
velocities. 

We extend that work in several ways. We estimate the statistical uncertainty in receiver functions
computed from real data and consider how that maps into inferred structure. We use "noisy" synthetics
to isolate the effects of additive noise and degree of regularization on receiver function waveforms.
We also demonstrate that the regularized deconvolution of noisy data can introduce a large bias into
receiver function amplitudes. We further investigate an important potential pitfall in modeling absolute
amplitudes, which may result from the error in assuming that the teleseismic P wave always arrives at
the backazimuth and incidence predicted from the relative event-receiver positions.

We use velocity spectrum stacks (VSS) to place bounds on the range of depths and velocities that
can produce similar phase delays across the range of teleseismic ray parameters.

Bootstrap Estimate of the Standard Deviation

For  most  of  the  examples  in  this  section  we  use  records  from  29  teleseismic  events,  within  a  small 
range  of  ray  parameters  from  the  recording  station,  that  were  used  to  infer  a  one-dimensional  velocity 
model  (Baker  et  al.,  1996).  A  bootstrap  estimate  of  the  standard  deviation  of  the  receiver  function  from 
those  seismograms  is  determined  (Efron,  1979,  1982).  This  is  done  by  randomly  selecting,  with 
replacement  (i.e.  a  seismogram  may  be  chosen  multiple  times),  29  seismograms  from  the  original  set  of 
29,  and  calculating  their  receiver  function.  This  is  repeated  100  times,  and  the  standard  deviation  of  the 
resulting  100  receiver  functions  is  an  estimate  of  the  standard  deviation  of  the  receiver  function 
calculated  from  the  original  29  seismograms.  In  our  first  example  we  complete  this  procedure  using  the 
simultaneous time-domain deconvolution technique of Gurrola et al. (1995). Figure 1 (right side)
shows  2  standard  deviation  error  bounds  about  the  mean  of  these  100  receiver  functions.  The 
Kolmogorov-Smirnov  test  indicates  that  the  amplitude  distribution  at  any  fixed  time  of  these  receiver 
functions  is  Gaussian.  If  the  errors  are  also  uncorrelated,  2  standard  deviations  represent  the  95% 
confidence  interval. 
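
The resampling procedure described above can be sketched as follows. This is a minimal illustration, not the processing code used in this study: the `estimator` argument stands in for the simultaneous deconvolution (here replaced by a simple mean stack), and the 29 traces are synthetic noisy pulses rather than real seismograms.

```python
import numpy as np


def bootstrap_std(traces, estimator, n_boot=100, seed=0):
    """Bootstrap mean and standard deviation of estimator(traces) per time sample.

    traces    : array of shape (n_events, n_samples), one row per event
    estimator : function mapping an (n_events, n_samples) array to one trace
    """
    rng = np.random.default_rng(seed)
    n_events = traces.shape[0]
    replicates = []
    for _ in range(n_boot):
        # Draw n_events rows with replacement, so a row may be chosen repeatedly.
        idx = rng.integers(0, n_events, size=n_events)
        replicates.append(estimator(traces[idx]))
    replicates = np.asarray(replicates)
    return replicates.mean(axis=0), replicates.std(axis=0, ddof=1)


# Stand-in data: 29 noisy copies of one pulse (an assumption for illustration).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 201)
pulse = np.exp(-((t - 3.0) ** 2) / 0.1)
traces = pulse + 0.2 * rng.standard_normal((29, t.size))

# A mean stack stands in for the receiver function computation.
mean_rf, std_rf = bootstrap_std(traces, lambda x: x.mean(axis=0))
lower, upper = mean_rf - 2.0 * std_rf, mean_rf + 2.0 * std_rf
```

The `lower` and `upper` arrays correspond to the 2 standard deviation bounds of the kind plotted in figure 1; substituting a deconvolution routine for the mean stack leaves the resampling logic unchanged.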

Confidence  Intervals  on  Velocity  Jump  Amplitudes 

We  use  the  forward  modeling  approach  of  Baker  et  al.  (1996)  to  map  the  95  percent  confidence 
interval  of  the  receiver  function  amplitudes  into  minimum  confidence  intervals  for  the  size  of  the 
velocity  contrasts  in  their  model  by: 

1)  adopting  the  same  independently  determined  surface  P  and  S  wave  velocities  as  Baker  et  al.  (1996). 

2)  choosing  the  depth  of  the  first  (next)  discontinuity  to  fit  the  peak  arrival  time  of  the  first  (next)  Ps 
phase. 

3) determining the range of velocities in the second layer that produce synthetic receiver functions
whose Ps phase spans the two standard deviation amplitude range of the corresponding observed Ps
phase. For example, the velocity contrast at the shallowest discontinuity in the model can range from
0.8 to 1.5 km/s.

To estimate confidence intervals for deeper discontinuities, we assume velocities for the shallower
layers that precisely fit the mean amplitudes of the bootstrapped receiver functions and then repeat
steps (2) and (3).

By  assuming  the  values  of  velocity  and  thickness  of  the  shallower  layers  that  provide  an  exact  fit 
to  the  earlier  receiver  function  phases,  we  have  optimistically  estimated  a  "best  case"  uncertainty  in  the 
velocity  contrast  at  each  discontinuity  (figure  1,  left  side).  These  are  not  estimates  of  uncertainty  in  the 
absolute  values  of  the  velocities,  and  represent  minimum  confidence  intervals  for  two  reasons.  First, 
we  have  ignored  errors  in  depth,  Vp,  and  Vs  of  shallower  layers  that  can  increase  the  absolute  error  in 
velocities  of  deeper  layers.  We  note  that  even  without  including  the  effects  of  errors  propagated  from 
shallower  layers,  the  uncertainty  in  the  inferred  velocity  contrast  increases  at  each  successive  interface, 
which  reflects  the  increased  error  in  receiver  function  amplitudes  as  a  function  of  time  delay  (figure  1, 
right  side).  Second,  so  far  we  have  only  considered  the  limitation  on  the  resolution  of  velocity  structure 
due  to  statistical  uncertainty  in  receiver  function  amplitudes.  We  next  discuss  the  bias  introduced  by 
deconvolution. 

Amplitude  Bias  due  to  the  Deconvolution 

Receiver  function  amplitudes  are  usually  presented  as  unbiased  estimates  of  the  true  amplitude. 
That  is,  it  is  assumed  that  the  true  amplitude  has  a  high  probability  of  being  within  some  uncertainty 
bounds  about  the  estimated  amplitude.  In  fact,  the  error  in  receiver  function  amplitudes  due  to  bias 
introduced  by  low  signal-to-noise  level  or  by  the  deconvolution  itself  may  be  even  greater  than  that 
indicated  by  the  statistical  uncertainty. 

We explore two related possible sources of error, that due to the type of deconvolution we employ
and that induced by the extent of regularization applied to the deconvolution. The simultaneous time
domain deconvolution is set up as a regularized inverse problem using the method of Lagrange
multipliers (Gurrola et al., 1995). Specifically, we write Ax = b, where A is the convolution matrix of a
vertical seismogram, b is the horizontal seismogram, and x is the receiver function. The regularized
solution is $x = (A^T A + \lambda I)^{-1} A^T b$. The Lagrange multiplier, $\lambda$, balances the trade-off between misfit and the
L2 norm of the model. In choosing $\lambda$ we attempt to balance what qualitatively appear to be excessive
high frequency noise and excessive ringing. The choice of $\lambda$ can be made consistent by automating its
selection, often based on the curve of misfit versus model norm, or even based on some a priori
knowledge of noise in the system. We find that within the range of acceptable values of $\lambda$, receiver
function amplitudes vary significantly (figure 2).
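
A minimal single-trace sketch of such a damped least-squares deconvolution is given below. It is not the simultaneous multi-event implementation of Gurrola et al. (1995); the decaying-exponential wavelet, the spike positions, and the function names are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def convolution_matrix(v, n_out):
    """Lower-triangular Toeplitz A with A @ x == np.convolve(v, x)[:len(v)]."""
    n = len(v)
    A = np.zeros((n, n_out))
    for j in range(n_out):
        A[j:, j] = v[: n - j]
    return A


def damped_deconvolution(vertical, horizontal, lam):
    """Solve (A^T A + lam * I) x = A^T b for the receiver function x."""
    A = convolution_matrix(vertical, len(vertical))
    lhs = A.T @ A + lam * np.eye(A.shape[1])
    return np.linalg.solve(lhs, A.T @ horizontal)


# Assumed test signals: a decaying wavelet and a two-spike receiver function.
n = 200
v = np.exp(-np.arange(n) / 4.0)
x_true = np.zeros(n)
x_true[0], x_true[30] = 1.0, 0.38
b = np.convolve(v, x_true)[:n]

x_light = damped_deconvolution(v, b, 1e-8)  # almost no damping
x_heavy = damped_deconvolution(v, b, 5.0)   # strong damping shrinks the solution
```

Even for noise-free input, increasing the damping reduces the norm of the recovered receiver function, which is one way to see how the choice of multiplier affects amplitudes.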

Regularization  is  required  because  the  data  are  contaminated  by  noise.  In  the  formulation  chosen, 
we  are  ignoring  the  presence  of  noise  in  the  vertical  records  that  are  used  to  construct  the  A  matrix.  In 
teleseismic  body  wave  recordings,  the  vertical  seismograms  have  higher  signal-to-noise  ratios  than 
those  of  the  horizontal  seismograms.  Synthetic  tests  we  have  performed,  in  which  we  compare  the 
results  of  deconvolution  of  noisy  and  noise-free  vertical  seismograms  from  noisy  horizontal 
seismograms  at  typical  signal-to-noise  levels  for  both,  indicate  that  ignoring  the  effects  of  noise  on  the 
vertical  seismograms  is  not  a  dominant  source  of  uncertainty  in  receiver  functions. 

In the frequency domain deconvolution for receiver functions, regularization is accomplished by
the 'water level' technique in which all values of the power spectrum of the vertical component of the
seismogram below a specified minimum value are replaced by that value (Oldenburg, 1981; Langston,
1979; Owens et al., 1984). High frequency noise introduced in the frequency domain deconvolution is
usually eliminated by low pass filtering (usually with a Gaussian filter) after deconvolution. In figure 2
we  show  four  different  receiver  functions,  each  calculated  from  the  same  25  seismograms  of  events 
from  a  single  source  region  (Baker  et  al.,  1996).  The  top  and  second  traces  were  computed  using 
Lagrange  multipliers  of  1000  and  5000,  applied  to  the  simultaneous  time  domain  deconvolution  of 
these  25  seismograms.  The  third  and  fourth  traces  were  computed  in  the  frequency  domain  using  water 
levels of $10^{-5}$ and $10^{-4}$ and Gaussian filters of half-width 5. These four receiver functions appear to be
equally  good  candidates  for  modeling. 
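
The water level technique can be sketched for a single trace as follows. Two conventions here are assumptions rather than details confirmed by the text: the water level is expressed as a fraction of the peak vertical power, and the Gaussian filter is exp(−ω²/4a²) with `gauss_a` playing the role of the width parameter. The output is not renormalized, so only relative amplitudes are meaningful in this sketch.

```python
import numpy as np


def water_level_deconvolution(vertical, horizontal, dt, water_level, gauss_a):
    """Frequency-domain deconvolution of vertical from horizontal.

    water_level : fraction of the peak vertical power used as a floor (assumed convention)
    gauss_a     : width parameter of the Gaussian low-pass filter (assumed convention)
    """
    n = 2 * len(vertical)                      # zero-pad to limit wrap-around
    Z = np.fft.rfft(vertical, n)
    H = np.fft.rfft(horizontal, n)
    power = (Z * np.conj(Z)).real
    # Replace spectral power values below the water level by the water level.
    denom = np.maximum(power, water_level * power.max())
    omega = 2.0 * np.pi * np.fft.rfftfreq(n, d=dt)
    gauss = np.exp(-omega ** 2 / (4.0 * gauss_a ** 2))
    rf = np.fft.irfft(H * np.conj(Z) / denom * gauss, n)
    return rf[: len(vertical)]


# Assumed test signals: the horizontal is a delayed, half-amplitude vertical.
dt = 0.1
vertical = np.exp(-np.arange(256) / 4.0)
horizontal = np.zeros(256)
horizontal[20:] = 0.5 * vertical[:-20]
rf = water_level_deconvolution(vertical, horizontal, dt, 1e-4, 5.0)
```

In this example the recovered pulse is centered at the 20-sample delay built into the test signal, broadened by the Gaussian filter.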

In  interpreting  receiver  functions,  we  usually  rely  on  the  amplitude  ratio  of  secondary  arrivals  to 
the  initial  peak  more  than  we  rely  on  absolute  amplitudes.  We  therefore  consider  the  variation  in  that 
amplitude ratio. Each trace is labeled with that receiver function's Moho Ps/P amplitude ratio. The Ps/P
ratio of the second trace is 17% larger than that of the third trace, and we do not know which of these
more closely represents the "correct" amplitude ratio. From this we conclude that significant systematic
error,  in  addition  to  the  statistical  uncertainty  estimated  in  the  previous  section,  is  likely  present  in 
receiver  function  amplitudes.  We  also  observe  that  estimated  amplitudes  of  smaller  arrivals  are  even 
less  stable  than  the  Moho  Ps  amplitude. 

The  Effect  of  Additive  Noise 

To  investigate  the  error  due  to  additive  noise  and  the  regularization  required  in  the  deconvolution 
operation,  we  use  realistic  synthetic  seismograms  computed  for  a  layer  over  a  half  space  model. 

To produce the noisy synthetic seismograms:

1)  We  first  compute  generalized  ray  theory  synthetic  seismograms  (Langston  and  Helmberger,  1975) 
through  a  simple  one  layer  model; 

2)  We  convolve  this  synthetic  with  25  different  source  functions  (generated  by  windowing  the  first  6. 
seconds  of  teleseismic  P-arrivals  and  tapering  the  trailing  end),  to  produce  25  different  synthetic 
seismograms. 

3)  We  then  add  25  different  segments  of  real  horizontal  and  vertical  seismic  noise  to  the  appropriate 
synthetics  (figure  3). 
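The three steps above can be sketched as follows. This is only a toy stand-in: spike trains replace the generalized ray theory synthetics, random wavelets replace windowed teleseismic P-arrivals, and Gaussian random noise replaces real seismic noise. The helper names (`tapered_source`, `add_noise`), the 3.6 s Ps delay, and the definition of S/N as peak signal amplitude over rms noise are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n = 0.05, 600                                # sample interval (s), trace length

def tapered_source(n_src, rng):
    """Stand-in source function: random wavelet with a cosine taper on its trailing half."""
    src = rng.standard_normal(n_src)
    taper = np.ones(n_src)
    half = n_src // 2
    m = n_src - half
    taper[half:] = 0.5 * (1.0 + np.cos(np.pi * np.arange(m) / m))
    return src * taper

def add_noise(signal, noise, snr):
    """Scale noise so that peak |signal| / rms(noise) equals snr, then add it."""
    scale = np.abs(signal).max() / (snr * np.sqrt(np.mean(noise**2)))
    return signal + scale * noise

# Step 1: impulse responses (spike trains stand in for the ray-theory synthetics).
vertical_resp = np.zeros(n)
vertical_resp[0] = 1.0
radial_resp = np.zeros(n)
radial_resp[0], radial_resp[72] = 0.445, 0.169   # P and Ps; 3.6 s delay is assumed

# Steps 2 and 3: 25 source functions, then 25 independent noise segments.
events = []
for _ in range(25):
    src = tapered_source(120, rng)               # 120 samples = 6 s at dt = 0.05 s
    z = np.convolve(vertical_resp, src)[:n]
    r = np.convolve(radial_resp, src)[:n]
    events.append((add_noise(z, rng.standard_normal(n), snr=10.0),
                   add_noise(r, rng.standard_normal(n), snr=10.0)))
```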

We  perform  the  simultaneous  time-domain  deconvolution  for  the  25  synthetic  seismograms  for  a 
range  of  noise  levels.  For  each  successively  lower  level  of  signal  to  noise  ratio,  a  larger  Lagrange 
multiplier  was  used  to  reduce  the  successively  higher  levels  of  high  frequency  oscillations  in  the 
solutions.  Figure  4  shows  the  effect  of  the  higher  noise  level  and  degree  of  regularization  on  Ps/P 
amplitude  ratios.  The  uncertainty  bounds  are  ±2  standard  deviations,  estimated  by  bootstrap  as  was 
done  for  figure  1.  The  number  above  each  datum  is  the  Lagrange  multiplier.  The  receiver  functions 
calculated  from  the  25  events  at  each  noise  level  are  shown  below  each  datum.  The  horizontal  dotted 
line at 0.380 indicates the expected value of the Ps/P amplitude ratio. As the S/N decreases, the Ps/P
amplitude ratios also decrease, to the point that the bias becomes a greater source of error than the
statistical uncertainty. In fact, the error bounds due to statistical uncertainty do not even increase
significantly until the lowest S/N is reached. That is partly because the successively higher levels of
regularization needed to damp out high frequency oscillations in the receiver functions at higher noise
levels help to keep the variation in amplitudes small, even as the mean amplitude shifts further and
further from the true solution. From this we can see that error bounds based on the statistical
uncertainty  are  not  indicative  of  the  accuracy  of  the  estimate.  This  is  troubling  when  we  consider  that 
for  S/N=10  (figure  4),  the  large  individual  receiver  function  arrivals  stand  out  much  further  above  the 
background  noise  level  than  do  arrivals  in  real  receiver  functions.  The  frequency  domain 
deconvolution  produces  a  very  similar  result. 
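To make the role of the Lagrange multiplier concrete, here is a minimal sketch of a damped least-squares time domain deconvolution. It assumes the simultaneous deconvolution can be written as one set of stacked normal equations, (Σ AᵢᵀAᵢ + λI)x = Σ Aᵢᵀbᵢ, where Aᵢ is the convolution matrix of the i-th vertical trace and bᵢ the matching radial trace. That formulation, the helper names, the random wavelets, and the Ps sample index are assumptions for illustration; noise is omitted so that only the effect of λ is visible.

```python
import numpy as np

def conv_matrix(w, m):
    """Return A (len(w) x m) such that A @ x equals np.convolve(w, x)[:len(w)]."""
    n = len(w)
    A = np.zeros((n, m))
    for j in range(m):
        A[j:, j] = w[:n - j]
    return A

def simultaneous_decon(verticals, radials, m, lam):
    """Solve (sum_i A_i^T A_i + lam I) x = sum_i A_i^T b_i for one receiver function x."""
    AtA = np.zeros((m, m))
    Atb = np.zeros(m)
    for z, r in zip(verticals, radials):
        A = conv_matrix(z, m)
        AtA += A.T @ A
        Atb += A.T @ r
    return np.linalg.solve(AtA + lam * np.eye(m), Atb)

rng = np.random.default_rng(0)
n, m, ps_index = 200, 100, 30                   # ps_index is an arbitrary choice
x_true = np.zeros(m)
x_true[0], x_true[ps_index] = 0.445, 0.169      # P and Ps amplitudes quoted in the text

verticals, radials = [], []
for _ in range(5):
    z = np.zeros(n)
    z[:40] = rng.standard_normal(40)            # hypothetical source wavelet
    verticals.append(z)
    radials.append(conv_matrix(z, m) @ x_true)  # noise-free radial trace

lams = [1e-8, 1.0, 1e2, 1e4]
solutions = [simultaneous_decon(verticals, radials, m, lam) for lam in lams]
for lam, x in zip(lams, solutions):
    print(f"lambda={lam:g}  P={x[0]:.3f}  Ps/P={x[ps_index] / x[0]:.3f}")
```

Even without noise, the solution norm shrinks as λ grows, which is the mechanism behind the amplitude bias described above.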

The  effects  of  noise  level  and  regularization  are  not  independent.  A  given  level  of  regularization 
will  produce  different  results  depending  on  the  level  and  character  of  noise  in  the  data.  Even  so,  we  can 
observe  something  of  the  relative  effects  of  noise  and  regularization  level  in  figure  5.  There  we  see  the 
receiver  functions  calculated  from  the  25  synthetic  seismograms  at  a  range  of  S/N  levels  and  levels  of 
regularization (each column is a constant S/N, each row is a constant λ). One striking effect is the
difference between over- and under-damped receiver functions. Receiver functions with no damping
(bottom row) all have excessive high frequency noise, which decreases with an increase in λ. Receiver
functions with increasingly larger λ are much smoother, but arrivals begin to exhibit large sidelobes.
Also, for a "good" receiver function, increased regularization decreases the Ps/P amplitude ratio. For
example, at virtually no noise (the right column), the receiver function with no regularization has the
correct amplitude. One might however prefer the receiver function at λ=10^, as it has much less high
frequency noise, so the significant arrivals are more prominent, but the Ps/P amplitude ratio is much
less accurate. One reason is that increasing the regularization level places increasing weight on
minimizing the norm of x, in addition to fitting the data, i.e., minimizing ‖Ax−b‖. This does not mean
that under-regularized receiver functions will provide more accurate amplitudes for real receiver
functions. In our simulations we can not obtain much more accurate amplitudes for higher noise level
receiver functions by applying less regularization. At too low a regularization level, the true arrivals
are overwhelmed by the high frequency noise (figure 5). Also, these experiments are idealistic in that
they only address one source of error, namely additive noise.

Figure 3.4 (time domain, 25 events): Ps/P amplitude ratios for the receiver functions calculated from
25 synthetics of the type illustrated in figure 3, at various signal to noise levels. The true Ps/P
amplitude ratio of 0.380 is indicated by the dashed line. The uncertainty bounds are bootstrap estimates
of 1 standard deviation of the distribution. The value of the Lagrange multiplier is shown above each
datum, and the receiver function for each is shown below. The Ps/P amplitude ratio decreases with
increasing noise and regularization.

Figure 3.5: This figure illustrates the effect of varying noise level (horizontal axis) and level of
regularization (vertical axis) on receiver function waveforms. The receiver functions are calculated
with the simultaneous time domain method from 25 noisy synthetic seismograms. The numbers above each
receiver function are the Ps/P amplitude ratios.

The  effects  of  increasing  regularization  at  higher  noise  levels  are  also  more  complicated.  For 
example, at S/N=10 and λ=0 (second column, bottom row, figure 5), the later arrivals are
indistinguishable in the high frequency noise. As λ increases to 10³ (second row up), the major arrivals
become more distinct and the Ps/P amplitude ratio becomes somewhat more accurate. At λ=10⁴ (third
row up), the receiver function has a great deal of long period noise, and the Ps/P amplitude ratio has
decreased again. At λ=10⁵ (top row), the receiver function actually looks quite good. It is very smooth,
and  although  there  are  significant  sidelobes,  especially  around  the  initial  peak,  the  messy  erratic  look 
has  improved.  The  Ps/P  amplitude  ratio  has  also  increased  again.  This  improvement  is  misleading 
however,  and  clearly  illustrates  a  danger  in  overdamping.  It  is  commonly  thought  that  if  roughness  is 
heavily  penalized,  then  any  arrivals  still  in  the  model  must  be  necessary  to  fit  the  data.  Here  we  see  a 
nice  looking  receiver  function,  with  a  clear  negative  arrival,  which  is  completely  spurious,  between  the 
first  and  second  legitimate  secondary  arrivals.  Such  features  can  be  produced  in  overdamped  models  to 
provide  the  next  best  fit  to  the  data  when  the  true  model  features  are  penalized. 

Figure  5  also  indicates  the  fallibility  of  the  argument  that  one  should  damp  all  receiver  function 
calculations  similarly,  in  the  mistaken  belief  that  this  will  treat  all  the  data  similarly  so  the  results  will 
be  comparable.  The  effect  of  a  given  level  of  damping  varies  greatly  depending  on  the  structure  of  the 
noise  relative  to  the  forward  model. 

To further illustrate the effects of varying noise level on waveforms, the 25 individual
receiver functions for each S/N level are shown in the right hand column of figure 6. These receiver
functions were calculated using the frequency domain deconvolution of Ammon (1991), which attempts
to preserve absolute amplitudes. As the signal to noise level decreases, the amplitudes of recognizable
phases begin to vary. The initial P and Ps amplitudes given in figure 6 are the averages of the 25
receiver functions at each S/N level. The two-standard-deviation errors listed indicate the statistical
uncertainty of the measurements. For receiver functions with no noise (top row, right side), the
amplitudes are essentially correct: the expected P and Ps amplitudes are 0.445 and 0.169 respectively.
Note however the variation in the waveforms. The set of receiver functions on the top left of figure 6
is computed from synthetic seismograms with no noise added. The only difference between the
seismograms used is the shape of the synthetic source function convolved with each.

As  was  observed  for  the  time  domain  receiver  functions,  the  accuracy  of  Ps  arrival  amplitudes 
deteriorates  as  S/N  decreases  (S/N  are  given  to  the  right  of  each  set  of  traces),  even  though  the  receiver 
functions  still  look  acceptable.  For  example,  in  the  case  where  S/N  equals  10  (bottom  right,  figure  7), 
the stack of all of the individual receiver functions looks as good as or better than typical real receiver
functions, but precise matching of amplitudes would lead to extremely inaccurate estimates of Earth
structure. 

The  Effects  of  Regularization  on  Receiver  Function  Waveforms 

We consider what causes some of the effects observed in figures 4, 5, and 6. For simplicity we will
confine the remainder of our discussion of the effects of regularization to the framework of the time
domain deconvolution. Similar effects and explanations exist for the frequency domain deconvolution.
We observed that if too little regularization is applied, the receiver function is plagued by high
frequency noise, and if too much regularization is applied, the receiver function is plagued by ringing
about each arrival. The reasons for those effects are straightforward. In any inverse problem, which we
can write as Ax=b, high frequency error in the solution x is due to the excessive prominence in x of the
eigenvectors associated with small eigenvalues of the A matrix. Even though we do not solve the
matrix inversion with singular value decomposition, or regularize by singular value truncation, it is
appropriate to discuss these effects in terms of the eigenvectors of the matrix, as regularization works by
directly affecting the influence of eigenvectors.

Because the regularized problem is written (AᵀA+λI)x=Aᵀb (Gurrola et al., 1995), we examine
the eigenvalues and eigenvectors of AᵀA, rather than A. Figure 8 shows some of the eigenvectors
associated with the largest and smallest eigenvalues of AᵀA, for the real receiver function of figure 1.
We confirm that the smallest eigenvalues are associated with very high frequency eigenvectors. With a
condition number of 10⁶, the eigenvectors associated with the smallest eigenvalues of AᵀA will be
overemphasized in the inverse. By applying a higher degree of regularization (i.e. increasing λ), those
eigenvectors lose their significance in the inversion, so the high frequency noise is damped and the
fit to the data is compromised. When the model size is penalized to effect regularization, the amplitudes
of peaks in the model are decreased, and in the case of receiver functions, sidelobes and prominent
secondary peaks adjacent to the true peaks provide the next best fit to the data. This is observed in
figures 4, 5, and 6.

Figure 3.7: Comparison of a stack of 25, S/N=10, receiver functions (bottom right) with a synthetic
receiver function for the same layer over a half space model after convolution with the averaging
function (averaging function is upper left and convolution with synthetic is upper right). The sidelobes
of the averaging function are very much smaller than those of the receiver function. A more accurate
estimate of the effects of deconvolution on the receiver function is shown on the lower left (discussed
in text).
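The eigenvector argument can be checked numerically on a toy problem. The sketch below builds the convolution matrix A for a hypothetical low-pass wavelet (a decaying exponential, chosen only because its spectrum falls off smoothly with frequency), computes the eigen-decomposition of AᵀA, and evaluates the weights s/(s+λ) that damped least squares applies to each eigenvector with eigenvalue s. The wavelet, matrix sizes, and the choice of λ are assumptions for illustration and do not correspond to the real data of figure 1.

```python
import numpy as np

n, m = 200, 60
w = 0.8 ** np.arange(n)                      # hypothetical low-pass vertical wavelet
A = np.zeros((n, m))
for j in range(m):
    A[j:, j] = w[:n - j]                     # convolution matrix of the wavelet

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
evals, evecs = np.linalg.eigh(A.T @ A)

def zero_crossings(v):
    """Count sign changes along a vector, a rough proxy for its dominant frequency."""
    return int(np.sum(np.sign(v[:-1]) != np.sign(v[1:])))

lam = 1e-2 * evals.max()
filter_factors = evals / (evals + lam)       # weight retained by each eigenvector

print("condition number:", evals.max() / evals.min())
print("zero crossings, smallest eigenvalue:", zero_crossings(evecs[:, 0]))
print("zero crossings, largest eigenvalue: ", zero_crossings(evecs[:, -1]))
print("filter factors (smallest, largest): ", filter_factors[0], filter_factors[-1])
```

In this toy case the eigenvector of the smallest eigenvalue is the most oscillatory and is also the one most strongly suppressed by the damping, consistent with the description above.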

We have found that significant variation of amplitudes (e.g. figure 2) occurs for different
reasonable choices of λ. Further, there is no possible choice of λ that will let us avoid all the negative
consequences of deconvolution of noisy data. We can, however, attempt to estimate the effect of
regularization  on  the  waveforms.  To  that  end  we  investigate  the  use  of  averaging  functions. 

Averaging  Functions  -  Preserving  Absolute  Amplitudes 

Averaging  functions  are  calculated  by  deconvolving  vertical  component  seismograms  from 
themselves,  with  the  same  level  of  regularization  as  applied  to  the  corresponding  receiver  function 
calculation  (Ammon,  1991).  Accurate  initial  peak  amplitudes  are  important  in  constraining  near 
surface  velocities,  so  to  restore  absolute  amplitudes,  receiver  functions  are  normalized  by  the 
maximum  amplitude  of  their  averaging  functions  (Ammon,  1991).  Initial  P  amplitudes  are  accurately 
restored  by  the  application  of  averaging  functions  to  the  noisy  synthetics,  throughout  the  given  range  of 
noise  levels.  In  the  worst  case,  S/N=10,  the  mean  amplitude  of  the  initial  P-arrivals  is  95%  of  the  true 
amplitude  and  nearly  within  two  standard  deviations  of  the  true  value. 

We  note,  however,  that  the  initial  peak  amplitude  is  underestimated  when  the  receiver  functions 
are  stacked.  This  holds  true  even  at  long  period  (i.e.  even  for  receiver  functions  smoothed  with  a 
Gaussian  of  half-width  of  unity).  This  error  occurs  because  of  what  appear  to  be  timing  errors  at  high 
noise levels. At all noise levels, in stacks of receiver functions, the peak arrival times of major phases
are correct, but the arrival times of these phases can be off by several time steps in individual receiver
functions (we used Δt=0.05 seconds). For example, for the S/N=10 synthetics, the mean maximum
amplitude for all 25 receiver functions (figure 6) is 0.422, 95% of the true amplitude, but for a stack of
the 25 receiver functions, the peak amplitude is 0.385, 87% of the true amplitude (figure 7). The reason
is that the individual receiver function peaks do not quite line up in time. For the S/N=10 receiver
functions, 12 are aligned, 5 each are 0.05 seconds late and early, 1 is 0.1 second early and 2 are 0.15
seconds late.

Figure 3.8: Eigenvectors of the AᵀA matrix (where A is the convolution matrix of the vertical
seismograms) for the receiver function shown in figure 1. The smallest eigenvalues are
associated with high frequency oscillations.
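A small numerical sketch shows why misaligned peaks lower the stack amplitude. It uses the peak-time offsets quoted above (12 aligned, 5 one sample late, 5 one sample early, 1 two samples early, 2 three samples late) and assumes each initial peak is a unit-amplitude Gaussian pulse exp(-(a t)²) with a = 5. The pulse shape and width are assumptions, so the result illustrates the effect rather than reproducing the 0.385 value reported in the text.

```python
import math

dt = 0.05            # sample interval used in the text (s)
a = 5.0              # assumed Gaussian width parameter for the pulse shape
# Peak-time offsets in samples (negative = early) for the 25 receiver functions.
shifts = [0] * 12 + [1] * 5 + [-1] * 5 + [-2] * 1 + [3] * 2

def pulse(k, shift):
    """Unit-amplitude Gaussian pulse evaluated at sample k, peaking at sample `shift`."""
    t = (k - shift) * dt
    return math.exp(-(a * t) ** 2)

samples = range(-20, 21)
individual_peaks = [max(pulse(k, s) for k in samples) for s in shifts]
stack = [sum(pulse(k, s) for s in shifts) / len(shifts) for k in samples]

mean_of_peaks = sum(individual_peaks) / len(individual_peaks)
stack_peak = max(stack)
print(mean_of_peaks, stack_peak)
```

Every individual trace has the same peak amplitude, yet the stack peak is lower because the maxima fall on different samples.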

This  problem  can  be  avoided  with  a  minor  change  in  how  the  normalization  is  done.  Typically,  the 
maximum  receiver  function  amplitude  is  normalized.  When  the  maximum  peak  position  does  not 
coincide  with  the  expected  initial  peak  position,  the  maximum  value  ends  up  with  the  correct 
normalized  value  and  the  value  at  the  expected  initial  peak  position  is  too  small.  That  results  in  smaller 
amplitudes  than  expected  when  receiver  functions  are  stacked.  If  receiver  function  amplitude  at  the 
expected  initial  peak  position  were  normalized,  then  the  value  of  the  stack's  initial  peak  would  be 
correct. We note that the receiver functions should not be shifted in time to align the initial peaks
before stacking, as the root cause of the apparent timing errors is actually amplitude error. That is,
the  additive  noise  causes  random  errors  in  the  initial  peak  and  near-initial  peak  amplitudes, 
occasionally  causing  the  near-initial  peak  values  to  be  greater. 

Averaging  Functions  -  Estimating  the  Effects  of  Deconvolution  on  the  Waveform 

Cassidy  (1992)  suggests  that  if  sidelobes  are  present  in  the  averaging  functions,  they  could  be 
incorporated  into  the  synthetics  when  modeling  receiver  functions.  There  is  a  fundamental  difference 
however, between the deconvolutions for receiver functions and averaging functions. How the radial
seismogram is mapped into the receiver function by the inverse of the vertical seismogram's convolution
matrix depends on the structure of the data in terms of the eigenvectors of the forward modeling
matrix. That relationship will be very different, and more complicated, for data that do not fit the
model (i.e. receiver functions) than for data that identically fit the model (i.e. averaging functions).
That leads us to question how good an estimate of the effects of regularization the averaging
function may be. Therefore we use the noisy synthetics to test how well averaging functions estimate
the effects of deconvolution on the waveform.

We  calculate  and  stack  the  receiver  functions  and  averaging  functions  for  the  25  synthetic 
seismograms  with  S/N  =  10.  We  use  the  frequency  domain  deconvolution  with  the  modification  for 
preserving amplitudes (Ammon, 1991). We compare the receiver function (bottom right, figure 7) with
an ideal synthetic convolved with the averaging function (top right, figure 7), and find that the
sidelobes are much larger for the receiver function than predicted by the averaging function.

We  do  one  further  experiment  with  these  synthetics  to  more  accurately  quantify  the  effect  of 
regularized  deconvolution  on  producing  sidelobes.  When  we  calculate  averaging  functions,  we 
deconvolve  the  vertical  component  synthetic  seismograms  with  noise  added  (figure  3),  from 
themselves. We now calculate what we label as a true averaging function (lower left, figure 7), by
doing  a  slightly  different  version  of  the  same  thing.  We  make  two  similar  vertical  seismograms  using 
the  same  source  function,  but  using  two  different  segments  of  vertical  seismic  noise,  and  deconvolve 
one  from  the  other.  We  do  this  25  times  and  stack  the  results.  Because  the  additive  noise  is  different  for 
each  pair  of  seismograms  being  deconvolved,  there  is  not  a  precise  fit  to  the  data  as  there  was  in  the 
case  of  the  averaging  function,  and  so  the  situation  of  the  true  averaging  function  is  closer  to  that  of  the 
receiver  function.  The  sidelobes  about  the  initial  peak  of  the  true  averaging  function  are  twice  as  large 
as  they  are  in  the  standard  averaging  function,  and  the  secondary  positive  peaks  (arrows  on  figure  7) 
are  an  order  of  magnitude  larger  than  for  the  standard  averaging  function.  From  this  we  conclude  that 
averaging functions provide a very optimistic, minimum estimate of the waveform distortion due to
regularization. This is strongly related to the point made in conjunction with figure 5, that a single level
of  regularization  will  not  have  the  same  effect  on  two  different  sets  of  data  with  different  noise 
characteristics. 

We  consider  one  final  aspect  of  averaging  functions.  The  averaging  function  is  nominally  only 
valid for the initial receiver function peak. We test the validity of the estimate for other points of a
receiver function by considering the operation in the time domain. The resolution about any point in a
receiver function is shown by the corresponding row of the resolution matrix, R=(AᵀA+λI)⁻¹AᵀA
(e.g.  Menke,  1989).  That  is,  each  point  of  a  receiver  function  may  be  thought  of  as  having  a  different 
averaging  function.  There  is  no  general  guarantee  that  the  averaging  functions  for  different  points  of  a 
receiver  function  will  be  the  same.  Nonetheless,  for  the  deconvolutions  performed  for  both  the  receiver 
function  modeled  in  Baker  et  al.  (1996)  and  the  more  highly  regularized  noisy  synthetics  of  figure  4, 
the  greatest  difference  in  amplitudes  of  averaging  functions  about  different  points  of  the  receiver 
functions  was  less  than  1  percent  and  the  shape  of  averaging  functions  about  different  points  is  nearly 
indistinguishable.  This  confirms  the  assumption  of  Ammon  (1991)  that  simply  deconvolving  the 
vertical  components  from  themselves,  which  in  the  time  domain  corresponds  to  finding  the  row  of  the 
resolution  matrix  corresponding  to  the  initial  peak  of  the  receiver  function,  is  as  appropriate  for  each 
point  of  the  receiver  function  as  it  is  for  the  initial  peak. 
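The comparison of averaging functions about different points can be sketched by forming the resolution matrix explicitly and comparing its rows. The example below uses a hypothetical decaying-exponential vertical trace and an arbitrary λ, both assumptions made for illustration; it does not reproduce the real-data or noisy-synthetic cases cited above.

```python
import numpy as np

n, m, lam = 300, 80, 0.5                      # sizes and damping are arbitrary choices
w = 0.8 ** np.arange(n)                       # toy vertical trace (assumption)
A = np.zeros((n, m))
for j in range(m):
    A[j:, j] = w[:n - j]                      # convolution matrix of the vertical trace

AtA = A.T @ A
R = np.linalg.solve(AtA + lam * np.eye(m), AtA)   # R = (A^T A + lam I)^-1 A^T A

# Row i of R is the averaging function about sample i. Compare interior rows
# after centering each one on its own diagonal element.
half = 10
rows = [R[i, i - half:i + half + 1] for i in range(half, m - half)]
peaks = [r[half] for r in rows]
peak_spread = max(peaks) - min(peaks)
shape_spread = max(np.abs(r - rows[0]).max() for r in rows)
print(peak_spread / max(peaks), shape_spread)
```

For this toy matrix the interior rows are nearly identical, which is the property that allows a single averaging function to stand in for every point of the receiver function.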

This  has  no  bearing  on  the  effects  of  regularized  deconvolution  of  noisy  data  in  diminishing  the 
amplitudes  of  secondary  peaks.  The  resolution  matrix,  and  averaging  functions,  are  only  affected  by 
the  assumed  Green's  functions  (the  vertical  seismograms)  and  ignore  how  well  the  data  are  fit,  which  is 
what  will  control  the  amplitudes  of  secondary  peaks. 

Relationship  between  receiver  function  amplitude,  model  velocity,  and  estimated  depths 

Gurrola  and  Minster  (1996)  stack  receiver  functions  from  all  azimuths  using  a  modification  of  the 
velocity  spectrum  stacking  technique  (VSS)  commonly  used  in  reflection  seismology  (Taner  and 
Koehler,  1969).  We  use  that  technique  here  to  examine  the  range  of  models  that  can  explain  the  full 
range  of  teleseismic  data.  We  use  data  from  a  shield  location  with  a  simple  crustal  model  in  order  to 
focus  on  the  trade-off  between  velocity  and  depth  in  arrival  times  and  minimize  possible  complications 
due  to  complex  crustal  structure. 

We computed receiver functions from the same 199 teleseismic events (from 30° to 90° distance
range) recorded at Obninsk, Russia (OBN) used by Gurrola et al. (1996). After low-pass filtering at 0.3
Hz, these receiver functions were effectively modeled with two layers over a half space (figure 9). The
large  number  of  events,  distribution  of  source  regions,  and  simplicity  of  the  model  make  this  an  ideal 
station  for  a  discussion  of  the  trade-off  between  velocity  and  depth  in  modeling  receiver  structure. 
Figure  10  is  a  VSS  produced  from  the  OBN  receiver  functions.  Each  point  in  a  VSS  is  produced  by 
applying  moveout  corrections  appropriate  for  a  particular  phase  to  each  receiver  function,  and  stacking 
them.  The  goal  is  to  identify  the  velocity  models  that  produce  the  largest  stacked  amplitude  for  the 
target  phase.  The  points  in  the  VSS  in  figure  10  represent  the  combinations  of  Vp,  Vs,  and  depth  that 
produce  similar  "largest"  amplitudes  (within  95%  confidence  levels  determined  by  a  bootstrap 
estimate)  when  events  from  the  entire  teleseismic  range  are  stacked  after  correction  for  moveout 
(Gurrola  and  Minster,  1996).  The  plot  on  the  upper  left  is  a  3-D  view  of  the  models  that  fit  the  arrival 
times  equally  well.  The  other  three  images  in  figure  10  are  the  same  VSS,  but  viewed  along  each 
coordinate  axis.  Velocities  are  represented  as  a  percentage  of  PREM,  and  are  allowed  to  vary  up  to 
10%  from  that  model,  which  is  more  than  twice  the  range  found  by  virtually  all  regional  and  global 
velocity models (Nolet et al., 1994). Such a range of Vp and Vs provides a 20 km range of equally
acceptable model depths (figure 10, upper left).

We  examine  the  relationship  between  perturbations  in  model  parameters  to  clarify  the  basis  of  the 
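variations observed in figure 10. The moveout correction is given by

$$ t = z\left(\sqrt{\frac{1}{V_s^{2}} - p^{2}} - \sqrt{\frac{1}{V_p^{2}} - p^{2}}\right) \qquad (1) $$
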
where t is the time delay of the Ps phase relative to the direct P wave (moveout correction), z is depth,
Vs is the S-velocity, Vp is the P-velocity, and p is the ray parameter. By differentiating equation 1 with
respect to z, Vs, and Vp, we arrive at an expression describing the relationship between small
perturbations in these parameters that produce similar moveout corrections.
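As a sketch of that differentiation, consider the vertical-incidence limit (p = 0), a simplification adopted here only to keep the algebra short. The standard Ps delay then reduces to z(1/Vs − 1/Vp), and setting its total differential to zero gives a relation between fractional perturbations:

```latex
\begin{align*}
t  &= z\left(\frac{1}{V_s} - \frac{1}{V_p}\right), \\
dt &= \left(\frac{1}{V_s} - \frac{1}{V_p}\right) dz
      - \frac{z}{V_s^{2}}\,dV_s + \frac{z}{V_p^{2}}\,dV_p = 0, \\
0  &= \left(\frac{V_p}{V_s} - 1\right)\frac{dz}{z}
      - \frac{V_p}{V_s}\,\frac{dV_s}{V_s} + \frac{dV_p}{V_p}.
\end{align*}
```

The last line follows from multiplying the second by Vp/z. For Vp/Vs = √3 ≈ 1.73 the coefficients become 0.73, −1.73, and 1, which matches the fractional form used in the remainder of this section.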


Figure 3.10: VSS for data recorded at OBN. Upper left, view in 3-space. Upper right, view in
Vp-depth space. Lower left, view in Vs-depth space. Lower right, view in Vp-Vs space.

For convenience Gurrola and Minster (1996) represent perturbations in these model parameters as
a ratio of the perturbation in each model parameter to a fixed value of that parameter, resulting in

$$ -1.73\,dM_s + dM_p + 0.73\,dM_z = 0, $$

or

$$ dM_z = \frac{1.73\,dM_s - dM_p}{0.73}, \qquad (3) $$

where dMp = dVp/Vp, dMs = dVs/Vs, and dMz = dz/z, and assuming a Poisson solid (Vp/Vs = √3). From
equation  3  we  see  that  a  correlated  10%  spread  of  possible  values  for  Vp  and  Vs  about  the  expected 
value  will  result  in  a  10%  change  in  depth  (4.4  km)  about  the  expected  depth,  which  is  smaller  than  the 
10  km  spread  of  depth  estimates  about  the  mean  value  of  44  km  seen  in  figure  10.  Equation  3  indicates 
that  a  10  km  spread  in  depth  estimates  would  require  a  6%  spread  in  anti-correlated  P  and  S  velocities, 
which  is  consistent  with  the  distribution  of  plausible  velocity  models  in  figure  10. 
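The two numerical statements above can be checked directly from equation 3. The short sketch below evaluates the fractional depth change for a correlated 10% velocity perturbation and for an anti-correlated 6% perturbation, using the 44 km mean depth quoted in the text. The function name `depth_perturbation` is introduced here for illustration only.

```python
import math

VP_VS = math.sqrt(3.0)          # Poisson solid, Vp/Vs = sqrt(3)

def depth_perturbation(d_vs, d_vp, vp_vs=VP_VS):
    """Fractional depth change dz/z that leaves the Ps delay unchanged (equation 3)."""
    return (vp_vs * d_vs - d_vp) / (vp_vs - 1.0)

mean_depth_km = 44.0
correlated = depth_perturbation(0.10, 0.10)        # Vp and Vs both raised by 10%
anticorrelated = depth_perturbation(0.06, -0.06)   # Vs raised 6%, Vp lowered 6%

print(correlated * mean_depth_km)       # depth change in km, correlated case
print(anticorrelated * mean_depth_km)   # depth change in km, anti-correlated case
```

The correlated case gives a 10% depth change (4.4 km), while the anti-correlated 6% case gives a change close to 10 km, in line with the values discussed above.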

For  near  vertical  ray  paths,  this  trade-off  between  depth  and  velocity  in  computing  arrival  times 
can not be overcome by simultaneously fitting the Ps, P2p1s and P1p2s phases. We use the notation of
Gurrola and Minster (1996), in which P2p1s indicates a reverberating phase with two P-wave path
segments between the discontinuity and the free surface and one S-wave leg. Similarly, P1p2s has one
P-wave leg and two S-wave legs. Figure 11 shows the receiver functions computed from the models
given in that figure, at ray parameters of 0.04, 0.06, and 0.08 sec/km. For this example we return to a
simplified version of the model used in the discussion of uncertainty in receiver function amplitudes
and time delays. The model is shown by the solid line in figure 11 (upper left), and includes a 30 km
thick layer (Vp = 6.5 km/sec and Vs = 3.8 km/sec) over a half space (with Vp = 8.0 km/sec and
Vs = 4.6 km/sec). The two models shown by dashed lines were computed for velocities and thicknesses
in the top layer ±10% of those given by the solid line. We can see that the delay times for the Ps, P2p1s
and P1p2s are nearly unaffected by a uniform change in model parameters. For vertical incidence the
travel time of each of these phases is given by


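$$ T_{Ps} = z\left(\frac{1}{V_s} - \frac{1}{V_p}\right), \qquad T_{P2p1s} = z\left(\frac{1}{V_s} + \frac{1}{V_p}\right), \qquad T_{P1p2s} = \frac{2z}{V_s} \qquad (4) $$

where TPs, TP2p1s and TP1p2s are the respective arrival times of the Ps, P2p1s, and P1p2s phases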
relative  to  the  initial  arrival.  The  ray  path  for  a  vertically  incident  wave  will  be  the  same  for  a  P-  and  an 
S-wave,  in  which  case  we  can  see  that  a  correlated  increase  or  decrease  in  Vp,  Vs,  and  z  will  result  in 
no change in each of these arrival times. From figure 11 (upper right and lower left), we see that the
arrival times behave very much like those of a vertically incident wave for the 0.04 sec/km and 0.06
sec/km ray parameters (which cover half the teleseismic band). For the 0.08 sec/km ray parameter
(figure 11, lower right), there is a small difference in the P2p1s arrival times which is perceptible in
noise-free delta-function synthetics, but would be less than the perceptible limits discussed above for
realistic synthetics and observed data. Figure 12 illustrates that just a 2% change in S-wave velocity is
sufficient to remove the difference observed between those synthetics. The dashed line model of figure
12 has a 2% lower S-wave velocity than the corresponding model of figure 11 (4.123 km/sec vs.
4.223). That difference is less than the expected resolution of any receiver function study. Figure 12
(right side) shows that the P2p1s arrival times are indistinguishable between those two models.
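The insensitivity of the delay times to a uniform scaling of the layer can be checked with a few lines of arithmetic. The sketch below uses the standard single-layer expressions for the delays of the direct conversion and the two reverberations at ray parameter p, written with the vertical slownesses sqrt(1/V² − p²). The layer values are those given in the text; the function names and the use of these particular expressions away from vertical incidence are assumptions made for illustration.

```python
import math

def delays(z, vp, vs, p):
    """Delays (s) relative to direct P of Ps, P2p1s and P1p2s for a single layer."""
    eta_p = math.sqrt(1.0 / vp**2 - p**2)    # vertical P slowness
    eta_s = math.sqrt(1.0 / vs**2 - p**2)    # vertical S slowness
    return (z * (eta_s - eta_p), z * (eta_s + eta_p), 2.0 * z * eta_s)

base = (30.0, 6.5, 3.8)                      # thickness (km), Vp, Vs (km/sec)

def max_shift(p, scale):
    """Largest change in any of the three delays when z, Vp and Vs are all scaled."""
    ref = delays(*base, p)
    pert = delays(*(scale * v for v in base), p)
    return max(abs(b - a) for a, b in zip(ref, pert))

for p in (0.0, 0.04, 0.06, 0.08):
    print(p, round(max_shift(p, 0.9), 3), round(max_shift(p, 1.1), 3))
```

At p = 0 the delays are unchanged by the scaling, and the shifts grow with ray parameter, so they are largest at the 0.08 sec/km end of the teleseismic range.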

Clarke  and  Silver  (1993)  also  discuss  the  problems  in  determining  a  unique  velocity-depth  model 
from near vertically incident phases. They point out that we at best can find a likely model defined by
Poisson's ratio and depth. To do so accurately, it is necessary to use more primary phases than are
typically used in a receiver function study. Specifically, they used P-to-S phases, S-to-P precursors and
reverberations (converted and direct) of several different teleseismic phases (e.g. P, S, ScS, PcP and
SKS), thereby enriching the available distribution of ray parameter.

Figure 3.11: The simple velocity model (upper left) adopted for demonstration of the effects of moveout
with ray parameter (solid line) and ±10% perturbations to that model (dashed lines). Receiver functions
calculated for the models for ray parameters of 0.04 (upper right), 0.06 (lower left), and 0.08 (lower
right).

Other  Sources  of  Uncertainty 

As  large  as  the  Ps/P  amplitude  ratio  variation  is  for  receiver  functions  of  synthetic  seismograms 
with  realistic  signal  to  noise  ratios,  it  is  even  greater  for  the  receiver  functions  calculated  from 
observed  data.  This  indicates  that  additive  noise  is  not  the  only  important  source  of  uncertainty  in 
receiver  function  amplitudes.  Greater  sources  of  error  than  additive  noise  are  likely  signal-generated 
noise  and  inadequacies  in  our  physical  assumptions  (e.g.  topographic  effects,  anisotropy,  3-D 
structure,  small  scale  scatterers  in  the  crust).  Imprecision  in  the  assumption  that  the  energy  follows  the 
great  circle  path  can  lead  to  errors  in  amplitudes  as  a  result  of  misrotation  of  the  seismograms  before 
deconvolution.  Baker  et  al.  (1996)  demonstrate  that  this  is  a  significant  problem  in  regions  of  tectonic 
complexity  such  as  PFO.  Anisotropy  observed  beneath  shield  regions  (Silver  et  al.,  1988)  can  also 
cause  raypath  bending  away  from  the  great  circle  path. 

Error  in  Receiver  Function  Amplitudes  due  to  Scattering  along  the  Paths  of  Different  Phases 

Ps  and  P  waves  sample  different  segments  of  the  crust,  and  array  studies  suggest  that  amplitudes 
of  similar  steep  arrivals  are  extremely  sensitive  to  small  variations  in  their  different  lithospheric  paths. 
For  example,  Haddon  and  Husebye  (1978)  observed  significant  spatial  variations  of  teleseismic  P 
amplitudes  recorded  at  the  NORSAR  array,  which  they  attributed  to  small  lithospheric  lateral  velocity 
heterogeneities.  The  amplitudes  fluctuated  much  more  rapidly  across  the  array  than  did  the  travel 
times.  We  have  seen  that  receiver  function  peak  arrival  times  are  much  less  sensitive  than  amplitudes 
to  error  introduced  by  additive  noise  and  the  processing.  This  indicates  that  the  receiver  function  peak 
arrival times are also more robust to small scale velocity heterogeneities in the Earth, and so are more
appropriate  for  making  inferences  about  major  discontinuities. 

At  the  LASA  array,  PcP  and  P  amplitudes  were  observed  to  be  uncorrelated  for  the  same  events, 
even  though  P  amplitude  variations  for  different  events  several  degrees  apart  are  highly  correlated.  The 
PcP  variations  for  the  same  sets  of  events  several  degrees  apart  are  also  highly  correlated  (Frasier  and 
Chowdhury,  1974).  The  authors  determined  that  the  lack  of  correlation  between  P  and  PcP  is  due  to 
near  receiver  scattering  by  complex  crustal  and  upper  mantle  structure.  This  is  especially  relevant  to 
the  interpretation  of  receiver  function  amplitudes,  because  the  difference  in  incidence  angle,  and  so 
crustal  raypaths,  between  P  and  PcP  for  events  recorded  at  LASA  is  the  same  as  the  difference  between 
P and Ps phases seen in receiver functions. For example, for a surface source 60° distant recorded at
LASA, assuming a 6 km/sec P-wave velocity and a Poisson solid near the receiver, the P-wave
incidence angle is 21.9°, and both the PcP and Ps incidence angles are 12.4°. Because the Ps and PcP
paths  vary  in  the  same  way  from  the  P-wave  path,  we  should  expect  to  see  similar  disparate  effects  on 
Ps  and  P  amplitudes  to  those  seen  in  PcP  and  P  amplitudes,  providing  another  indication  that  inference 
of  precise  interface  dips  or  velocity  jumps  from  relative  Ps  and  P  amplitudes  would  be  dubious. 
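
The incidence angles quoted above follow from Snell's law, sin(i) = p v, where p is the ray parameter and v the near-receiver velocity. The short sketch below is only an illustration: it recovers the ray parameter from the quoted 21.9° P incidence angle (rather than from a travel-time table) and then computes the Ps incidence angle for a Poisson solid. The function and variable names are our own.

```python
import math

def incidence_angle_deg(ray_parameter_s_per_km, velocity_km_s):
    """Incidence angle in degrees from Snell's law, sin(i) = p * v."""
    return math.degrees(math.asin(ray_parameter_s_per_km * velocity_km_s))

VP = 6.0                  # km/s, near-receiver P velocity quoted in the text
VS = VP / math.sqrt(3.0)  # km/s, Poisson solid (Vp/Vs = sqrt(3))

# Ray parameter implied by the quoted 21.9 degree P incidence angle.
p_direct_p = math.sin(math.radians(21.9)) / VP  # s/km

# The Ps conversion shares the ray parameter of the direct P wave,
# but arrives at the surface as an S wave.
ps_angle = incidence_angle_deg(p_direct_p, VS)
print(f"p = {p_direct_p:.4f} s/km, Ps incidence angle = {ps_angle:.1f} degrees")
```

Under these assumptions the Ps angle comes out near the 12.4° quoted for both Ps and PcP.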

Errors  in  Model  Parameters  due  to  Inappropriate  Assumptions  of  Incidence  or  Backazimuth 

In  areas  of  complex  structure  there  is  often  a  difference,  called  the  mislocation  vector  (Davies  and 
Sheppard,  1972),  between  measured  and  predicted  incidence  angles  and  backazimuths.  In  fact, 
mislocation  vectors  may  be  significant  enough  to  be  used  for  inferring  lateral  velocity  variations  (e.g. 
Walck  and  Minster,  1982,  Powell  and  Mitchell,  1994).  We  consider  the  sensitivity  of  receiver  function 
amplitudes  to  such  mislocation  vectors.  For  the  optimistic  case  of  a  difference  between  predicted  and 
observed  incidence  angles  of  3  degrees,  for  the  very  simple  layer  over  a  half  space  model  used  earlier, 
there is an approximately 16% difference in the initial P amplitude and a 19% difference in the
Moho  Ps  amplitude  (figure  13).  On  the  other  hand,  there  is  only  a  3%  difference  in  the  Moho  Ps/P 
amplitude ratio. This suggests that mislocation vectors should be examined for the events from which
receiver functions are calculated and that, especially where they are large, modeling Ps/P amplitude ratios
will lead to much smaller errors than modeling absolute amplitudes.
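
The arithmetic behind the small change in the ratio can be checked directly. The sketch below assumes only that the P and Ps amplitudes change in the same sense (both larger or both smaller), which this passage does not state explicitly; the function name is our own.

```python
def ratio_change(p_change, ps_change):
    """Fractional change in Ps/P given fractional changes in the P and Ps amplitudes."""
    return (1.0 + ps_change) / (1.0 + p_change) - 1.0

# Both amplitudes larger, or both smaller, by the quoted 16% (P) and 19% (Ps).
both_larger = ratio_change(0.16, 0.19)
both_smaller = ratio_change(-0.16, -0.19)

print(f"Ps/P change if both increase: {both_larger:+.1%}")
print(f"Ps/P change if both decrease: {both_smaller:+.1%}")
```

In either case the ratio moves by only a few percent, because errors common to both phases largely cancel in the quotient.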

If structure in the region modeled is not horizontally layered, the difference between observed and
predicted backazimuth (the tangential component of mislocation vectors) can be large and can cause
further  misinterpretations  of  receiver  functions.  Rotation  of  the  horizontal  component  seismograms  to  a 
predicted  backazimuth  that  differs  from  the  true  backazimuth  of  energy  incident  on  a  dipping  interface 
produces  only  negligible  errors  in  the  absolute  P  amplitude  of  the  radial  receiver  functions,  but  can 
cause  much  larger  errors  in  converted  phases,  and  even  larger  errors  in  amplitudes  of  reverberations 
(e.g.  Cassidy,  1992).  On  the  tangential  receiver  functions,  such  misrotation  can  cause  polarity  reversals 
of  some  phases. 
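
A minimal sketch of the rotation step shows why the radial P amplitude is insensitive to a small backazimuth error while spurious tangential energy appears. It assumes the simplest case, a purely radially polarized arrival above flat layering, so it does not reproduce the larger errors in converted phases and reverberations described above for dipping interfaces. The sign convention (radial pointing away from the source, tangential 90° clockwise from radial) and all names are our assumptions.

```python
import math

def rotate_ne_to_rt(north, east, backazimuth_deg):
    """Rotate north/east amplitudes to radial/tangential for a given backazimuth."""
    baz = math.radians(backazimuth_deg)
    radial = -north * math.cos(baz) - east * math.sin(baz)
    tangential = north * math.sin(baz) - east * math.cos(baz)
    return radial, tangential

def radial_arrival(amplitude, true_backazimuth_deg):
    """North/east amplitudes of a purely radial arrival from the true backazimuth."""
    baz = math.radians(true_backazimuth_deg)
    return -amplitude * math.cos(baz), -amplitude * math.sin(baz)

TRUE_BAZ = 300.0  # degrees, hypothetical event backazimuth
north, east = radial_arrival(1.0, TRUE_BAZ)

for error_deg in (0.0, 5.0, 10.0, 20.0):
    radial, tangential = rotate_ne_to_rt(north, east, TRUE_BAZ + error_deg)
    print(f"misrotation {error_deg:4.1f} deg: radial = {radial:.3f}, tangential = {tangential:+.3f}")
```

The radial amplitude falls off as the cosine of the rotation error, while the energy leaked onto the tangential component grows as its sine.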

Conclusions 

Analysis  of  the  uncertainties  in  receiver  functions  using  both  real  data  and  synthetics  indicates  that 
peak  arrival  times  are  a  reliable  source  of  information  about  the  Earth.  Peak  arrival  times  do  not  vary 
significantly  with  noise  level  or  with  the  method  of  deconvolution  or  the  level  of  regularization. 
Amplitudes  are,  however,  very  sensitive  to  the  effects  of  both  noise  and  the  deconvolution.  Noise  not 
only  contributes  to  the  very  large  statistical  uncertainty  of  receiver  function  amplitudes,  but  can  bias 
the regularized deconvolution toward smaller amplitudes. The bias can be the source of much larger
error than that indicated by the statistical uncertainty.
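
The direction of this bias can be illustrated at a single frequency. The sketch below assumes a damped spectral division, which is a stand-in for, and may differ in detail from, the regularized time domain deconvolution used in this study; the spectral values and damping levels are hypothetical.

```python
def damped_spectral_ratio(numerator, denominator, damping):
    """Damped division N D* / (|D|^2 + damping) at a single frequency."""
    return numerator * denominator.conjugate() / (abs(denominator) ** 2 + damping)

TRUE_RATIO = 0.2                # hypothetical noise-free radial/vertical spectral ratio
vertical = 1.0 + 0.0j           # hypothetical vertical-component spectral value
radial = TRUE_RATIO * vertical  # radial value consistent with the true ratio

# Noisier data call for more damping, and more damping shrinks the estimate.
for damping in (0.0, 0.1, 0.5, 1.0):
    estimate = abs(damped_spectral_ratio(radial, vertical, damping))
    print(f"damping = {damping:.1f}: estimated ratio = {estimate:.3f}")
```

The estimate is exact only for zero damping and decreases monotonically as the damping grows, which is the sense of the bias described above.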

The  interpretation  of  receiver  function  amplitudes  is  also  very  sensitive  to  errors  in  the  assumed 
ray  parameter  and  backazimuth,  and  is  likely  sensitive  to  small  differences  in  velocity  structure 
between  the  P  and  Ps  ray  paths.  These  observations  all  indicate  that  the  most  reliable  value  of  receiver 
functions  lies  in  the  identification  of  discontinuities  and  the  appropriate  depth-velocity  curve.  Without 
outside  constraints  on  velocities,  depths  cannot  be  constrained.  We  have  shown  that  the  errors  in  depth 
are  of  the  same  order  as  the  errors  in  velocity,  so  accurate  a  priori  velocity  information  will  make  it 
possible  to  accurately  constrain  depths. 
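
The trade-off between depth and velocity can be sketched with the standard expression for the Ps minus P delay time of a conversion at depth H beneath flat layering, t = H [ (1/Vs² - p²)^(1/2) - (1/Vp² - p²)^(1/2) ]. The example assumes a hypothetical 30 km deep interface, a single uniform layer, a fixed Poisson's ratio, and a ray parameter of 0.06 sec/km; it is an illustration only, not the modeling procedure used in this study.

```python
import math

def vertical_slowness_difference(vp, vs, p):
    """S minus P vertical slowness (s/km) for ray parameter p (s/km)."""
    return math.sqrt(1.0 / vs**2 - p**2) - math.sqrt(1.0 / vp**2 - p**2)

def ps_delay(depth_km, vp, vs, p):
    """Ps minus P delay time (s) for a conversion at depth_km."""
    return depth_km * vertical_slowness_difference(vp, vs, p)

def depth_from_delay(delay_s, vp, vs, p):
    """Depth (km) that explains an observed delay for the assumed velocities."""
    return delay_s / vertical_slowness_difference(vp, vs, p)

TRUE_DEPTH = 30.0        # km, hypothetical interface depth
TRUE_VP = 6.0            # km/s, hypothetical average P velocity
VP_VS = math.sqrt(3.0)   # Poisson solid
P = 0.06                 # s/km, hypothetical ray parameter

observed_delay = ps_delay(TRUE_DEPTH, TRUE_VP, TRUE_VP / VP_VS, P)

for velocity_error in (-0.10, -0.05, 0.0, 0.05, 0.10):
    vp = TRUE_VP * (1.0 + velocity_error)
    depth = depth_from_delay(observed_delay, vp, vp / VP_VS, P)
    depth_error = depth / TRUE_DEPTH - 1.0
    print(f"{velocity_error:+.0%} velocity -> depth {depth:.1f} km ({depth_error:+.1%})")
```

For this simple case the fractional depth error is close to the fractional velocity error, consistent with the statement that the two are of the same order.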

The large error associated with receiver function amplitudes renders dubious the precise estimation of a
velocity  contrast,  or  the  precise  dip  of  a  discontinuity.  That  lack  of  precision  is  exacerbated  by  the 
large  number  of  parameters  to  which  the  amplitude  is  sensitive,  including  the  velocities  and  densities 
above and below the interface (Ammon et al., 1990), the degree and direction of the dip of the interface, and
the incidence angle and backazimuth of energy arriving at the interface (Baker et al., 1996). Amplitudes
should therefore only be used to corroborate inferences of approximate dip directions, relative dips of
interfaces, or relative magnitudes of velocity contrasts.

It  might  seem  that  our  criticism  of  the  technique  that  we  have  relied  on  so  heavily  in  Baker  et  al. 
(1996)  is  unduly  pessimistic.  On  the  contrary,  it  is  this  careful  examination  of  the  pitfalls  of  receiver 
function  waveform  modeling  that  gives  us  a  high  degree  of  confidence  in  the  interpretations  made.  The 
knowledge  that  peak  arrival  times  are  more  robust  than  amplitudes  legitimizes  the  approach  that 
focuses  on  modeling  arrival  times.  In  Baker  et  al.  (1996),  it  was  the  careful,  critical  analysis  of 
mislocation  vectors  that  indicated  the  one  quadrant  for  which  we  could  at  least  speculate  regarding  the 
tangential  component  receiver  functions,  confident  that  they  were  not  an  artifact  of  misrotation  of  the 
horizontal  seismograms. 

Receiver function analysis is a powerful technique that provides information about the existence,
relative strengths, and position in depth-velocity space of shear wave discontinuities beneath a
three-component seismic station.




References 


Ammon, C.J., G. Randall, and G. Zandt (1990), On the nonuniqueness of receiver function inversions,
J. Geophys. Res., 95, 15,303-15,318

Ammon, C.J. (1991), The isolation of receiver effects from teleseismic P waveforms, Bull. Seism. Soc.
Am., 81, 2504-2510

Baker, G.E., J.B. Minster, G. Zandt, and H. Gurrola (1996), Constraints on crustal structure and
complex Moho topography beneath Piñon Flat, California, from teleseismic receiver functions,
Bull. Seism. Soc. Am., 96, (in press)

Bostock, M.G. (1996), Ps conversions from the upper mantle transition zone beneath the Canadian
landmass, J. Geophys. Res., 101, 8383-8402

Burdick, L.J., and C.A. Langston (1977), Modeling crustal structure through the use of converted
phases in the teleseismic body waveforms, Bull. Seism. Soc. Am., 67, 677-692

Cassidy, J.F. (1992), Numerical experiments in broadband receiver function analysis, Bull.
Seism. Soc. Am., 82, 1453-1474

Cassidy, J.F. and R.M. Ellis (1994), S wave velocity structure of the northern Cascadia subduction
zone, J. Geophys. Res., 98, 4407-4421

Clarke, T.J., and P.G. Silver (1993), Estimation of crustal Poisson ratios from broad band teleseismic
data, Geophys. Res. Lett., 20, 241-244

Davies, D., and R.M. Sheppard (1972), Lateral heterogeneities in the Earth's mantle, Nature, 239, 318

Efron,  B.E.  (1982),  The  jackknife,  the  bootstrap  and  other  resampling  plans,  pub.  Society  for  Industrial 
and  Applied  Mathematics 

Efron,  B.E.  (1979),  Bootstrap  methods:  another  look  at  the  jackknife,  Ann.  Statis.,  7,  1-26 

Frasier, C.W. and D.K. Chowdhury (1974), Effect of scattering on PcP/P amplitude ratios at LASA from
40° to 84° distance, J. Geophys. Res., 79, 5469-5477

Gurrola, H.G., G.E. Baker, and J.B. Minster (1995), Simultaneous time domain deconvolution with
application to receiver functions, Geophys. J. Int., 120, 537-543

Gurrola, H., and B. Minster, Thickness estimates of the upper mantle transition zone from bootstrapped
velocity spectrum stacks of receiver functions, submitted to Geophys. J. Int., Sept., 1996

Haddon,  R.A.W.  and  E.S.  Husebye  (1978),  Joint  interpretation  of  P-wave  time  and  amplitude 
anomalies  in  terms  of  lithospheric  heterogeneities,  55,  19-43 

Kind,  R.  and  L.  P.  Vinnik  (1988),  The  upper-mantle  discontinuities  underneath  the  GRF  array  from  P- 
to-S  converted  phases,  J.  Geophys.,  62,  138-147 

Langston, C.A., and D. Helmberger (1975), A procedure for modeling shallow dislocation sources,
Geophys. J. Int., 42, 117-130

Langston  C.A.  (1979),  Structure  under  Mount  Rainier,  Washington,  inferred  from  teleseismic  body 
waves,  J.  Geophys.  Res.,  84, 4749-4762 




Menke,  W.  (1989),  Geophysical  data  analysis:  Discrete  Inverse  Theory,  Academic  Press  Inc. 

Nolet, G., S.P. Grand, and B.L.N. Kennett (1994), Seismic heterogeneity in the upper mantle, J.
Geophys.  Res.,  99,  23,753-23,766 

Oldenburg,  D.W.  (1981),  A  comprehensive  solution  to  the  linear  deconvolution  problem,  Geophys. 
J.R.  Astr.  Soc.,  65,  331-357 


Owens, T.J., G. Zandt, and S.R. Taylor (1984), Seismic evidence for an ancient rift beneath the
Cumberland  Plateau,  Tennessee:  a  detailed  analysis  of  broadband  teleseismic  P  waveforms,  J. 
Geophys.  Res.,  89,  7783-7795 

Powell,  C.,  and  B.  Mitchell  (1994),  Relative  array  analysis  of  the  southern  California  lithosphere,  J. 
Geophys.  Res.,  99,  15,257-15,275 

Silver,  P.G.  and  W.W.  Chan  (1988),  Implications  for  continental  structure  and  evolution  from  seismic 
anisotropy,  Nature,  335,  34-39 

Taner,  M.  T.  and  F.  Koehler  (1969),  Velocity  spectra:  digital  computer  derivation  and  applications  of 
velocity  functions,  Geophys.,  34,  859-881 

Vinnik,  L.  P.  (1977),  Detection  of  waves  converted  from  P-to-SV  in  the  mantle,  Phys.  Earth  Planet. 
Inter.,  15, 294-303 

Walck, M. and J.B. Minster (1982), Relative array analysis of upper mantle lateral velocity variations in
southern  California,  J.  Geophys.  Res.,  87, 1757-1772 


Chapter  4 


The  role  of  seismology  in  monitoring  nuclear  testing 


Introduction 

Nuclear  weapons  constitute  one  of  the  most  obvious  threats  to  the  continuation  of  human  society, 
and  since  their  development,  efforts  have  been  made  to  curb  their  proliferation.  Early  tests  of  nuclear 
weapons  by  the  United  States  and  the  Soviet  Union  were  conducted  in  the  atmosphere  until  the  Partial 
Test  Ban  Treaty  was  signed  by  the  U.S.  and  the  U.S.S.R.  in  1963,  prohibiting  nuclear  weapons  testing 
anywhere  but  underground.  Seismology  then  became  the  critical  tool  for  monitoring  nuclear 
explosions,  and  annual  funding  for  seismology  increased  25-fold  from  the  1950’s  to  the  1960’s,  driving 
rapid  development  of  the  field.  The  early  focus  was  on  establishing  a  global  network  for  the  detection 
of  nuclear  explosions,  their  discrimination  from  earthquakes,  and  determination  of  their  locations. 
Discrimination  began  simply  with  an  examination  of  P-wave  first  motions,  and  advanced  with  the 
recognition  that  seismograms  from  nuclear  explosion  sources  were  deficient  in  S-wave  energy 
compared  to  earthquakes,  and  deficient  in  surface  wave  energy  relative  to  shallow  earthquakes. 

The  Threshold  Test  Ban  Treaty,  signed  in  1974,  limited  the  yield  of  nuclear  explosions  to  150  kt, 
and the size of the explosions became a focus of attention. It was recognized that where continental paths
are  available,  the  regional  phase  Lg  provides  an  accurate  magnitude  and,  therefore,  yield  estimate  (e.g., 
Nuttli,  1973,  Baumgardt,  1984,  Ringdal  and  Hokland,  1987). 

Since  1992,  the  U.S.,  most  other  western  nuclear  powers,  and  Russia  have  adhered  to  a 
moratorium  on  nuclear  testing,  and  France  has  joined  after  completing  a  series  of  tests  this  year.  All  are 
involved  in  negotiating  a  Comprehensive  Test  Ban  Treaty  (CTBT),  which  would  ban  all  nuclear 
testing.  China  has  also  declared  its  interest  in  signing  a  CTBT,  once  it  completes  an  ongoing  series  of 
tests.  All  the  while  that  efforts  to  limit  the  development  of  weapons  among  declared  nuclear  powers 
were continuing, a parallel effort was underway to limit the spread of such weapons. The Non-Proliferation
Treaty (NPT), signed in 1968, prohibited nuclear weapon possession by all signatories
except  the  declared  nuclear  weapons  states.  Discussions  on  extension  of  the  NPT  are  ongoing,  with 
their  resolution  being  dependent  on  the  outcome  of  CTBT  talks.  Seismology  will  remain  one  of  the 
most  powerful  tools  available  for  monitoring  both  a  CTBT  and  the  NPT,  with  the  focus  on  reducing  the 
levels  at  which  detection,  location,  and  discrimination  are  feasible.  This  will  be  achieved  by  the 
expansion  of  global  seismic  coverage  (e.g.  Simpson  et  al.,  1996;  Harjes,  1996).  This  effort  will  involve 
analyzing  regional  seismograms  from  areas  of  the  globe  for  which  seismicity  and  seismic  propagation 
have  not  previously  been  characterized.  Thus,  improving  the  effectiveness  of  regional  seismic 
monitoring  has  become  an  important  goal. 

The  second  half  of  this  thesis  deals  with  one  of  the  technical  issues  important  to  this  global  effort; 
that  is,  improving  the  performance  of  nuclear  discriminants  at  the  regional  level.  Specifically,  the 
research  focuses  on  improved  understanding  of  the  effects  of  path  properties,  such  as  waveguide 
thickness, slope, and roughness, on Lg propagation. Quantifying such effects is the essential first step
toward understanding them and, eventually, toward producing path corrections for regional seismic
discriminants  that  are  transportable  (i.e.  that  may  be  applied  in  a  region  other  than  where  they  were 
derived). 

Errors  in  regional  nuclear  discriminants  due  to  path  effects 

Earthquakes  and  nuclear  explosions  are  very  different  types  of  seismic  sources  (e.g.  Mueller  and 
Murphy,  1971;  Stevens  and  Day,  1985),  and  discrimination  between  them  would  be  much  simpler, 
almost  foolproof,  if  there  were  seismic  instruments  very  near  all  seismic  sources.  That  being 
impractical,  we  must  use  seismograms  recorded  anywhere  from  hundreds  to  thousands  of  kilometers 
from  source  epicenters.  The  discriminants  measure  the  differences  between  energy  of  different  types, 
or  between  energy  in  different  frequency  bands,  that  should  reflect  corresponding  differences  at  the 
source.  Intrinsic  and  scattering  attenuation,  and  conversion  of  energy  from  one  type  to  another  occur 
during propagation and can cause significant errors in regional seismic discriminants. The most
effective discriminants are those that are in some way “path-proof”. For example, the Lg/Pg amplitude
ratio  is  a  more  accurate  measure  of  source  parameters  than  Lg/Pn  amplitude  ratios,  because  the  energy 
in  Lg  and  Pg  phases  travels  in  the  crustal  waveguide  and  so  is  subject  to  similar  changes  in  the  path  of 
propagation.  Pn,  on  the  other  hand,  travels  mostly  in  the  upper  mantle  where  it  is  subject  to  completely 
different  path  variations.  In  fact,  in  a  test  of  all  single  regional  high  frequency  discriminants,  the  Lg/Pg 
amplitude  ratio  was  found  to  be  the  most  effective  (Taylor  et  al.,  1989).  There  is,  however,  still 
significant error associated with the Lg/Pg amplitude ratio discriminant (e.g. Taylor et al., 1989).

Discrimination  can  be  improved  in  a  number  of  ways.  Denser  coverage  by  better  instruments  in 
better  sites  is  always  desirable.  More  accurate  discriminants  may  still  be  developed.  If  the  errors  in  the 
current  regional  discriminants  are  in  any  way  systematic,  they  may  also  be  reduced.  It  is  this  last  option 
that  we  will  pursue. 

To  improve  discrimination,  we  would  like  to  know  whether  misclassified  events  have  anything  in 
common.  Previous  research  (e.g.  Baumgardt,  1985;  Zhang  et  al.,  1994)  has  shown  correlations  of  Lg 
amplitude  variation  with  features  along  the  propagation  path.  Thus,  it  is  reasonable  to  expect  some 
systematic  geographic  variation  in  the  amplitude  ratios.  We  test  this  idea  using  data  from  the  southern 
California  seismic  network  (SCSN).  For  reasons  detailed  below,  the  southern  California  seismic 
network  provides  an  ideal  laboratory  for  improving  regional  discrimination.  To  test  the  spatial 
correlation  of  misclassified  events,  we  plot  the  log(Lg/Pg)  amplitude  ratio  discriminant  at  each  SCSN 
station  recording  a  set  of  regional  events,  both  earthquakes  and  nuclear  explosions  from  the  Nevada 
Test  Site  (NTS)  (figure  1).  For  consistency  with  previous  studies  the  data  are  filtered  to  match  the 
WWSSN  short-period  instrument  response  and  the  peak  amplitudes  of  Lg  and  Pg  are  measured.  Later, 
to  ensure  somewhat  greater  robustness  of  measurements,  we  use  rms  measurements  of  Lg  and  Pg 
amplitudes.  The  data  then  are  filtered  from  0.6  to  3  Hz  to  maximize  observation  of  Lg  by  avoiding 
most  longer  period  fundamental  mode  Rayleigh  wave  energy  and  the  sometimes  overwhelming  level 
of higher frequency noise. We find, however, that discrimination results are virtually identical for the
peak and rms amplitude measurements. The events and SCSN stations are shown in figure 2. Indeed, a
strong geographic pattern is observed across the SCSN for NTS events and several earthquakes at
different locations (figures 3-13). The problem now becomes a search for the cause of the patterns of
variations observed.

Figure 4.1: Peak log(Lg/Pg) amplitude ratios for a subset of SCSN data, corrected to the WWSSN
response. Earthquake records are plotted as crosses and explosion records are plotted as circles.
The discrimination line separating earthquakes and explosions matches that found by Taylor et al.
(1986) for the western U.S. We will address whether the scatter and misidentification (i.e. symbols
on the "wrong side" of the line) are due to identifiable propagation effects.
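
The log(Lg/Pg) measurement itself is simple once the Pg and Lg windows have been chosen. The sketch below is a hypothetical illustration: it assumes an already filtered trace held as a list of samples and windows given as sample index ranges, and it does not include the filtering or the window selection. All names and the toy trace are our own.

```python
import math

def peak_amplitude(samples):
    """Largest absolute amplitude in the window."""
    return max(abs(x) for x in samples)

def rms_amplitude(samples):
    """Root-mean-square amplitude in the window."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def log_lg_pg(trace, pg_window, lg_window, measure=peak_amplitude):
    """log10 of the Lg/Pg amplitude ratio, using the chosen amplitude measure."""
    pg = measure(trace[pg_window[0]:pg_window[1]])
    lg = measure(trace[lg_window[0]:lg_window[1]])
    return math.log10(lg / pg)

# Toy trace: the first five samples stand in for Pg, the last five for Lg.
trace = [0.0, 0.5, -1.0, 0.5, 0.0, 0.0, 1.0, -3.0, 2.0, -1.0]
PG_WINDOW = (0, 5)
LG_WINDOW = (5, 10)

peak_value = log_lg_pg(trace, PG_WINDOW, LG_WINDOW, peak_amplitude)
rms_value = log_lg_pg(trace, PG_WINDOW, LG_WINDOW, rms_amplitude)
print(f"peak log(Lg/Pg) = {peak_value:.3f}, rms log(Lg/Pg) = {rms_value:.3f}")
```

Passing the amplitude measure as an argument makes it easy to compare the peak and rms versions of the discriminant on the same windows.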

The  pattern  of  variation  in  Lg/Pg  amplitude  ratios  is  similar  for  10  NTS  events  whose  records  we 
have examined (figures 3, 4, and 5). The sources span 50 km in epicentral distance, indicating that
very  near  source  scattering  is  not  responsible  for  the  Lg/Pg  variation.  A  similar  pattern  is  even 
observed  for  a  shallow  earthquake  that  occurred  within  NTS  (figures  6  and  7),  indicating  that  the 
source  radiation  pattern  is  also  not  likely  a  significant  factor.  Figures  8, 9,  and  10  indicate  that  distance, 
near  receiver  scattering,  and  site  effects  are  not  primary  causes  of  the  variation.  We  also  see  no 
correlation  between  nodes  in  the  P  radiation  pattern  (either  predicted  by  known  mechanisms  or 
observed  in  first  arrival  polarizations)  and  maxima  in  Lg/Pg  ratios  (figures  11-13),  strengthening  the 
previous  indication  (figures  6  and  7)  that  the  source  radiation  pattern  does  not  dominate  the  observed 
geographical  distribution.  By  elimination  of  other  possibilities,  we  conclude  that  geographic  variations 
in  Lg/Pg  amplitude  ratios  result  from  differences  in  structure  to  which  the  regional  phases  are  sensitive. 
Next  we  consider  what  may  cause  the  variations  along  the  propagation  paths. 
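
The standardization used to compare events (figures 4 and 5) removes each event's mean log(Lg/Pg) value and scales to unit standard deviation before averaging by station. A minimal sketch follows, with hypothetical station codes and values; the choice of the population rather than the sample standard deviation is our assumption.

```python
import statistics
from collections import defaultdict

def standardize(values_by_station):
    """Remove the event mean and scale so that one standard deviation equals 1.0."""
    values = list(values_by_station.values())
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    return {station: (value - mean) / std for station, value in values_by_station.items()}

def station_means(events):
    """Average the standardized values at each station over the events it recorded."""
    collected = defaultdict(list)
    for values_by_station in events:
        for station, z in standardize(values_by_station).items():
            collected[station].append(z)
    return {station: statistics.mean(zs) for station, zs in collected.items()}

# Hypothetical log(Lg/Pg) values for two events recorded at overlapping station sets.
events = [
    {"STA1": 0.1, "STA2": 0.3, "STA3": 0.5},
    {"STA1": -0.2, "STA2": 0.0},
]

for station, mean_z in sorted(station_means(events).items()):
    print(f"{station}: mean standardized log(Lg/Pg) = {mean_z:+.2f}")
```

Because each event is standardized separately, differences in event size or overall discriminant level do not mask the station-to-station pattern.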

To  determine  what  propagation  effects  cause  Lg/Pg  amplitude  ratios  to  vary,  it  would  be  helpful  to 
have  two  types  of  information  that  are  rarely  available.  One  is  seismic  data  from  a  very  densely  spaced 
network  that  spans  the  type  of  structures  thought  to  cause  Lg  attenuation  and  blockage.  The  second  is 
knowledge  of  the  amplification  at  all  station  sites.  We  discuss  the  reasons  for  wanting  both  of  these  in 
the  following  sections. 

The  importance  of  fine  scale  observations 

Lg  blockage  by  oceanic  crust  was  recognized  by  the  first  researchers  to  describe  Lg  (Press  and 
Ewing,  1952).  Insight  into  the  physical  processes  behind  seismic  observations  has  been  gained  by 
synthetic  modeling,  despite  the  many  simplifying  assumptions  required  to  make  the  analytical  and 


[Figure 4.1 axes: log(Lg/Pg) versus Mb; earthquakes (+) and explosions (o)]


Figure  4.2:  Map  of  southern  California  seismic  network  stations  (triangles)  with  regional 
earthquakes  (asterisks)  and  explosions  (circles)  that  are  discussed  in  the  chapter. 
identification of the source as an earthquake. Circles indicate identification as an explosion. Symbols are scaled by their distance from the
discrimination  line  of  figure  1.  The  smaller  explosions,  Floydada  (1991,  day  227,  upper  left)  and  Coso  (1991,  day  67,  upper  right),  are 
misclassified  more  frequently  than  the  larger  explosions,  Lubbock  (1991,  day  291,  lower  left)  and  Hoya  (1991,  day  257,  lower  right). 

The  sampling  is  biased  because  of  the  limited  dynamic  range  of  the  instruments,  leading  to  clipped  records  at  short  distances  for  large 
events,  and  low  signal  to  noise  level  records  at  long  distances  for  small  events. 

Figure 4.4: To examine possible geographic variation in the pattern of log(Lg/Pg) amplitudes, we
standardize  the  distribution  of  each  event’s  discriminant  values  (i.e.  remove  the  mean  and  normalize 
so  that  one  standard  deviation  equals  1.0).  The  result  above  is  for  Hoya  (figure  3d).  Here,  symbol  size 
is  scaled  by  proximity  to  the  mean,  with  crosses  positive  and  circles  negative.  Note  that  the  crosses 
appear  to  be  in  the  same  region  in  which  misclassifications  occurred  for  the  smaller  explosions. 


Figure  4.5:  The  normalization  described  in  figure  4  was  performed  for  10  nuclear  explosions.  The 
mean  for  each  station,  for  all  the  events  it  recorded,  was  then  plotted  above,  revealing  a  distinct 
geographic  pattern.  Although  we  have  ignored  the  possible  effects  of  the  bias  in  sampling  (nearer 
stations  recording  more  small  events  and  more  distant  stations  recording  larger  events),  inspection 
of  results  for  individual  events  suggests  that  the  pattern  is  robust.  That  the  pattern  appears  to  be 
common  to  all  events,  whose  sources  span  50  km  within  NTS,  argues  against  near  source  scattering 
strongly  affecting  the  discriminant  values. 


Figure  4.6:  Log(Lg/Pg)  amplitude  ratios  for  earthquake  no.  4  (Mb  =  3.8)  of  figure  2.  As  in  figure  3, 
crosses  are  "earthquake-like"  values  scaled  by  distance  above  the  discrimination  line  of  figure  1, 
and circles represent "explosion-like" values scaled by distance below the line. For this shallow
earthquake  (~  2  km  depth),  100  out  of  150  recordings  misclassify  the  event  as  an  explosion.  This 
serves  as  a  warning  that,  even  if  we  develop  perfect  path  corrections,  there  will  be  anomalous  events. 


Figure  4.7:  Normalized  log(Lg/Pg)  amplitude  ratios,  as  in  figure  4,  for  earthquake  number  4  (figures 
2  and  6).  The  relative  pattern  of  large  and  small  Lg  to  Pg  amplitude  ratios  is  quite  similar  to  that 
observed for the nuclear explosions (figure 5), further indicating that neither source radiation nor
near source scattering is important to the pattern of Lg/Pg amplitude ratio variations.




Figure  4.8:  Event  classification  as  in  figure  6,  for  earthquake  no.  3  (Mb  =  4.4)  of  figure  2.  All  of  the 
stations  at  which  the  event  is  misclassified  are  clustered  at  the  greatest  distance  from  the  source. 
Although  this  alone  might  suggest  a  potential  distance,  or  near-receiver  scattering,  or  receiver  site 
effect,  we  find  a  very  different  pattern  for  earthquake  no.  1  (figure  9). 


Figure  4.9:  Event  classification  as  in  figures  6  and  8  for  earthquake  no.  1  (Mb  =  4.2)  of  figure  2.  In 
contrast  to  figure  8,  the  distinct  area  of  misclassifications  is  nearest  the  source,  and  at  a  completely 
different set of stations. Taken together, this figure and figure 8 indicate that neither distance, near-receiver
scattering, nor site effects control the pattern of relative Lg to Pg amplitudes in any simple way.

Figure 4.10: Event classification as in figures 6, 8, and 9, for earthquake no. 6 (Mb = 4.2) of figure 2.
There  are  almost  no  misclassifications  for  this  event.  The  two  areas  in  which  the  smaller  amplitude 
ratios  cluster  (in  the  center  and  at  the  northwest  edge  of  the  network),  are  distinct  from  the  areas  of 
smaller  amplitude  ratio  for  the  events  from  different  azimuths  (figures  6,  8,  and  9),  strengthening  the 
argument  against  any  influence  of  distance,  near-receiver  scattering,  or  site  effects. 


Figure 4.11: Pn first motion polarity for earthquake number 1 (figures 2 and 9). Triangles indicate
positive  first  motion,  and  circles,  negative.  If  the  source  radiation  controlled  the  pattern  of  Lg/Pg 
amplitude  ratio  variation  observed,  we  might  expect  higher  Lg/Pg  amplitude  ratios  to  be  recorded 
along  P-nodal  lines.  Here,  the  polarities  suggest  a  possible  P-nodal  plane  running  roughly  northwest- 
southeast,  correlating  in  no  way  with  the  amplitude  ratio  pattern  observed. 




Figure 4.12: Event classification as in figures 6, 8, 9, and 10, for earthquake no. 2. There were relatively
fewer observations for this Mb = 5.4 event, due to clipping, than were available for the nearby earthquake
no. 3 (figure 8), but the pattern of misclassifications is similar. We can compare this result to
one  predicted  by  the  known  source  focal  mechanism  (figure  13). 


Figure  4.13:  Ratio  of  predicted  S  to  P  amplitudes  radiated  from  the  source  of  earthquake  no.  2.  The 
relative  amplitudes  have  been  normalized  as  in  figure  4.  The  largest  crosses  are  in  the  vicinity  of  the 
P-node,  just  the  opposite  of  the  observed  pattern  (figure  12).  This  suggests  that  at  approximately  1  Hz, 
and  in  this  distance  range,  the  focal  mechanism  has  little  effect  on  the  observed  Lg  and  Pg  amplitudes. 
computational aspects of modeling wave propagation tractable. Attempts to explain Lg blockage by
synthetic  modeling  have  limited  value,  however,  for  two  reasons.  One  is  simply  that  Lg  is  a  high 
frequency  phase  and  modeling  assumptions  about  the  homogeneity  of  the  earth  begin  to  break  down  at 
higher  frequency  (Gibson  and  Campillo,  1994).  The  second  problem  is  a  lack  of  appropriate 
observations  for  comparison.  The  feasibility  of  several  different,  sometimes  conflicting  mechanisms 
has  been  demonstrated  in  different  synthetic  modeling-based  studies.  For  example,  Cao  and  Muirhead 
(1993)  used  finite  difference  simulations  of  P-SV  waves  across  thinning  crust  overlain  by  sediments 
and  water,  and  concluded  that  the  water  column  is  critically  important  for  Lg  blockage.  Zhang  and  Lay 
(1995)  modeled  propagation  through  very  similar  structure,  also  using  finite-difference  simulations  as 
well  as  normal  mode  analysis,  to  argue  that  the  water  column  is  unnecessary.  They  conclude  that  Lg 
does  not  propagate  in  oceanic  crust  because  an  insufficient  number  of  higher  surface  wave  modes  can 
exist  in  crust  only  6  km  thick.  Other  approaches  indicate  the  feasibility  of  other,  contradictory, 
blockage mechanisms. Kennett (1986) uses ray diagrams to argue for the predominance of backscattering
by dramatically thinning crust at the continental-oceanic crust interface, with what little
energy  is  transmitted  into  the  oceanic  crust  rapidly  leaking  into  the  mantle.  Maupin  (1989)  finds  that 
coupled  mode  synthetics  predict  insufficient  backscattering  of  energy,  or  leakage  into  the  mantle,  to 
explain  the  extent  of  blockage  observed.  She  concludes  that  low  Q,  due  to  scattering  by  small  scale 
lower  crustal  heterogeneity,  likely  controls  Lg  blockage. 

The  difficulty  is  that  while  synthetic  studies  can  indicate  a  variety  of  feasible  mechanisms,  they 
cannot  unequivocally  determine  which  mechanism  is  actually  operating  in  the  earth.  Further 
constraints  on  the  models,  based  on  better  observations  than  are  currently  available,  are  necessary  to 
distinguish  between  the  proposed  mechanisms.  Baumgardt  (1990),  using  array  analysis,  demonstrated 
the  possibility  that  Lg  was  scattered  to  Sn  at  a  continent-ocean  transition,  providing  a  further  constraint 
on synthetic studies. A recent study, more conclusive due to the density of spatial sampling (Shapiro
et al., 1996), may resolve more of the differences between synthetic studies, but has also pointed out a
major problem with all approaches thus far. Most studies reporting Lg blockage have depended on very
widely spaced stations and events, and so some ambiguity over the rate of Lg blockage has existed. It
is  widely  quoted  in  the  modeling  literature  (e.g.  Kennett,  1986,  Cao  and  Muirhead,  1993,  Zhang  and 
Lay,  1995)  that  100  to  200  km  of  oceanic  crust  blocks  all  Lg  propagation,  and  modelers  have  used  this 
number  as  a  minimum  constraint  on  their  models.  In  fact,  in  the  experiment  of  Shapiro  et  al.  (1996),  7 
ocean  bottom  seismometers  covered  100  km  that  spanned  the  continent-ocean  transition.  That  dense 
coverage  revealed  that  Lg  scatters  effectively  to  Sn  and  to  slow  S  waves  in  the  ocean  bottom 
sediments,  disappearing  within  just  20  km.  None  of  the  mechanisms  proposed  in  the  previously 
discussed  studies  predict  such  a  rapid  loss  of  Lg  energy.  On  the  contrary,  it  is  challenging  to  identify  a 
mechanism  that  will  completely  attenuate  Lg  within  even  100  km  in  such  models.  Thus  while  the 
mechanisms  proposed  may  be  valid,  they  may  not  be  relevant. 

It  was  also  recognized  very  early  that  in  addition  to  blockage  by  oceanic  crust,  other  interruptions 
of  “normal”  continental  structure  could  block  Lg  transmission.  Bath  (1954)  reported  that  Lg  did  not 
propagate  across  the  Tibetan  Plateau  or  the  Caucasus.  Sedimentary  basins  have  also  been  observed  to 
attenuate  Lg  significantly  (e.g.  Baumgardt,  1990,  Ibanez  et  al.,  1991).  However,  continental  blockage 
is  more  equivocal  than  oceanic.  Some  mountain  ranges,  such  as  the  Tien  Shan,  appear  to  attenuate  Lg, 
but do not completely extinguish it (Ruzaikin et al., 1977). The Norwegian-Danish basin, with 8 to 10
km  deep  sediments,  has  no  apparent  effect  on  Lg  propagation  (Gregerson,  1984).  Attempts  at  modeling 
continental  blockage  have  been  even  less  successful  than  those  aimed  at  explaining  oceanic  blockage, 
although  they  have  been  useful  in  paring  the  list  of  feasible  mechanisms.  For  example,  Gibson  and 
Campillo (1994) argue that because neither boundary-integral-equation nor dynamic ray tracing
synthetics can predict the Lg blockage observed in propagation across the Pyrenees, the basic
large-scale structure of the mountain range is not the cause. They suggest, as did Maupin (1989), small scale
heterogeneity  in  the  lower  crust. 

Just as Shapiro et al. (1996) made substantial progress in understanding Lg blockage by oceanic
crust by making appropriate observations, understanding of continental Lg blockage will advance
with better observations. Regional events from many azimuths recorded at the SCSN, with its
20 km station spacing over a 300 by 500 km region spanning mountain ranges (with and without
crustal roots), deep and shallow sedimentary basins, an active rift zone, and areas of both rapidly and
gradually varying crustal thickness, can handily meet this need.

The  need  for  site  amplifications 

To  make  it  clear  why  knowledge  of  the  amplification  at  all  recording  sites  is  desirable,  it  is  useful 
to  recapitulate  our  assessment  to  this  point. 

1)  Errors  in  the  Lg/Pg  amplitude  ratio  discriminant  can  be  attributed  to  propagation  effects. 

2)  To  improve  discrimination  it  is  our  goal  to  be  able  to  predict,  and  thus  correct  for,  those  propagation 
effects. 

3)  At  least  one  of  the  phases  involved,  Lg,  is  greatly  affected  by  changes  in  crustal  waveguide 
properties,  although  the  mechanisms  behind  those  changes  are  not  well  understood. 

In  fact,  Pg,  although  less  thoroughly  studied,  is  also  known  to  be  sensitive  to  variations  in 
waveguide  properties.  In  synthetic  studies,  efficient  Pg  propagation  is  shown  to  depend  on  the 
existence  of  a  low  velocity  surface  layer  (e.g.  Haskell,  1966;  Olsen,  et  al.,  1983).  It  is  not  clear  that  Lg 
and  Pg  are  sensitive  to  the  same  parameters,  at  least  in  the  same  way.  Thus,  while  we  could  develop 
empirical  relationships  based  on  the  correlations  between  changes  in  Lg/Pg  amplitude  ratios  and 
waveguide  properties,  their  predictive  value  elsewhere  would  be  dubious  without  an  understanding  of 
their  physical  basis.  To  understand  the  basis  for  the  changes  in  the  amplitude  ratios,  an  understanding 
of  the  propagation  of  each  phase  is  necessary.  That  understanding  will  be  based  on  correlations 
between  changes  in  path  properties  and  the  amplitudes  of  individual  phases.  To  make  such 
correlations,  it  will  be  necessary  to  separate  site  effects  from  path  effects,  for  which  knowledge  of  site 
amplifications  is  necessary. 

As  we  are  concentrating  first  on  understanding  Lg  propagation,  we  will  attempt  to  calibrate  the 
SCSN  sites  for  Lg  propagation.  Note  that  when  amplitude  ratio  discriminants  are  used,  it  is  assumed 
implicitly that site amplifications are the same for both phases used, a possible source of error if
differences exist and are mapped into path corrections. Barker et al. (1981), however, found that for
recordings  of  events  with  very  similar  paths  to  adjacent  stations  on  very  different  geological  structures, 
the  Lg  to  Pg  amplitude  ratios  were  the  same.  Eventually,  we  must  examine  carefully  the  question  of 
how  sensitive  site  amplifications  are  to  wavetype.  If  the  Lg  site  amplifications  prove  to  be  appropriate 
for  Pg  as  well,  gaining  understanding  of  that  phase’s  variations  will  be  much  simpler. 

The  rest  of  this  thesis  is  devoted  to  enabling  the  observation  of  variations  in  absolute  Lg  amplitudes 
along  all  types  of  propagation  paths.  To  do  so,  we  must  calibrate  all  possible  SCSN  sites  for  Lg 
amplification, which we do using the near-receiver scattered (diffuse) component of teleseismic coda as
an  isotropic  source  of  Lg-like  energy.  The  coda  measurements  are  noisy,  with  the  possibility  of  many 
large  errors,  and  the  data  are  censored,  so  in  chapter  5  we  present  the  development  and  application  of 
appropriate  statistical  techniques  for  obtaining  the  best  site  amplification  estimates  possible  from  the 
data.  In  chapter  6,  we  examine  the  nature  of  diffuse  coda  and  Lg,  both  through  a  review  of  the 
literature  and  through  array  analysis.  We  also  develop  and  apply  a  method  for  separating  the  near 
receiver  from  the  near  source  scattered  coda.  In  chapter  7  we  present  a  preliminary  application  of  the 
site  amplifications  to  Lg,  and  discuss  the  steps  necessary  to  quantify  the  relationship  between  structure 
and  Lg  amplitude  variation. 




References 


Barker, B.W., Z.A. Der, and C.P. Mrazek (1981), The effect of crustal structure on the regional phases
Pg and Lg at the Nevada Test Site, J. Geophys. Res., 86, 1686-1700

Bath, M. (1954), The elastic waves Lg and Rg along Euroasiatic paths, Arkiv för Geofysik, 2, 295-325

Baumgardt, D.R. (1984), Relative Lg and P-coda magnitude analysis of the Shagan River Explosions,
Final Report to AFOSR, SAS-TR-84-03, ENSCO, Inc., Springfield, Virginia

Baumgardt,  D.R.  (1985),  Attenuation,  blockage,  and  scattering  of  teleseismic  Lg  from  underground 
nuclear  explosions  in  Eurasia,  ENSCO,  Inc.,  Scientific  Report,  No.  1,  AFGL-TR-85-0332 

Baumgardt, D.R. (1990), Investigation of teleseismic Lg blockage and scattering using regional arrays,
Bull. Seis. Soc. Am., 80, 2261-2281

Cao, S. and K.I. Muirhead (1993), Finite difference modeling of Lg blockage, Geophys. J. Int., 115,
85-96

Gibson R.L. Jr. and M. Campillo (1994), Numerical simulation of high- and low-frequency Lg-wave
propagation, Geophys. J. Int., 118, 47-56

Gregerson,  S.  (1984),  Lg-wave  propagation  and  crustal  structure  differences  near  Denmark  and  the 
North  Sea,  Geophys.  J.  R.  Astr.  Soc.,  79,  217-234 

Harjes,  H.-P.,  (1996)  Towards  a  global  seismic  monitoring  system  -  lessons  learned  from  Geneva 
experiments,  in  Monitoring  a  Comprehensive  Test  Ban  Treaty ,  Kluwer  Academic  Publishers 

Haskell,  N.A.  (1966),  The  leakage  attenuation  of  continental  crustal  P  waves,  J.  Geophys.  Res.,  71, 
3955-3967 

Ibanez,  J.M.,  J.  Morales,  F.  de  Miguel,  F.  Vidal,  G.  Alguacil,  and  A.M.  Posadas  (1991),  Effect  of  a 
sedimentary  basin  on  estimations  of  Qc  and  QLg,  Phys.  Earth  Plan.  Int.,  66,  244-252 

Kennett, B.L.N. (1986), Lg waves and structural boundaries, Bull. Seis. Soc. Am., 76, 1133-1141

Maupin, V. (1989), Numerical modeling of Lg wave propagation across the North Sea Central Graben,
Geophys. J. Int., 99, 273-283

Mueller,  R.A.  and  J.R.  Murphy  (1971),  Seismic  characteristics  of  underground  nuclear  detonations, 
Bull.  Seis.  Soc.  Am.,  61,  1675-1692 

Nuttli, O.W. (1973), Seismic wave attenuation and magnitude relations for eastern North America, J.
Geophys. Res., 78, 5212-5218

Olsen, K.H., L.W. Braile, and J.N. Stewart (1983), Modeling short-period crustal phases (P, Lg) for
long-range refraction profiles, Phys. Earth Plan. Int., 31, 334-347

Press F. and M. Ewing (1952), Two slow surface waves across North America, Bull. Seis. Soc. Am.,
42, 219-228

Ringdal, F., and B.K. Hokland (1987), Magnitudes of large Semipalatinsk explosions using P coda and
Lg measurements at NORSAR, NORSAR Sci. Rept. No. 1-88/89, Kjeller, Norway




Ruzaikin,  A.I.,  I.L.  Nersesov,  and  V.I.  Khalturin  (1977)  Propagation  of  Lg  and  lateral  variations  in 
crustal  structure  in  Asia,  J.  Geophys.  Res.,  82,  307-316 

Shapiro,  N.,  N.  Bethoux,  M.  Campillo,  and  A.  Paul  (1996),  Regional  seismic  phases  across  the 
Ligurian  Sea:  Lg  blockage  and  oceanic  propagation,  Phys.  Earth  Plan.  Int.,  93, 257-268 

Simpson,  D.,  R.  Butler,  T.  Ahern,  and  T.  Wallace,  (1996)  Expanding  the  global  seismographic 
network,  in  Monitoring  a  Comprehensive  Test  Ban  Treaty,  Kluwer  Academic  Publishers 

Taylor, S.R., M. Denny, E. Vergino, and R. Glaser (1989), Regional discrimination between NTS
explosions and western U.S. earthquakes, Bull. Seis. Soc. Am., 79, 1142-1176

Stevens, J.L. and S.M. Day (1985), The physical basis of mb:Ms and variable frequency magnitude
methods of earthquake/explosion discrimination, J. Geophys. Res., 90, 3009-3020

Zhang, T., S. Schwartz, and T. Lay (1994), Multivariate analysis of waveguide effects on short-period
regional wave propagation in Eurasia and its application in seismic discrimination, J. Geophys.
Res., 99, 21929-21945

Zhang T. and T. Lay (1995), Why the Lg phase does not traverse oceanic crust, Bull. Seis. Soc. Am.,
85, 1665-1678


Chapter  5 


Iterative  reweighting  for  estimation  of  magnitude  and  site  amplifications 
from  doubly  censored  and  corrupted  data 


Abstract 

Geophysical  data  are  commonly  drawn  from  heavy  tailed  distributions  and  are  often  censored.  We 
present  the  use  of  two  techniques  that  improve  parameter  estimation  from  such  data.  The  first,  the 
technique  of  robust  reweighting  of  data  based  on  misfit,  limits  bias  in  parameter  estimates  either  from 
outliers  or  non-Gaussian  distributed  errors  and  improves  accuracy  of  error  analysis.  The  second 
technique  permits  the  incorporation  of  censored  data  into  parameter  estimates  through  maximum 
likelihood  estimation. 

We  use  the  example  of  event  magnitude  and  site  amplification  estimation  from  censored  seismic 
network  data,  with  both  synthetic  examples  and  real  data,  to  illustrate  the  implementation  and 
effectiveness  of  these  techniques.  For  the  second  technique,  we  derive  the  likelihood  function  for  the 
problem  and  make  a  linear  approximation  to  find  its  maximum  with  an  iterative  algorithm. 

Problems  addressed 

We  address  some  problems  common  to  geophysical  data  that  can  cause  significant  errors  in 
estimated  parameters. 

The  first  problem  is  that  of  non-normally  distributed  errors.  Least  squares  is  probably  the  most 
common  method  used  to  estimate  parameters  from  geophysical  data.  Its  popularity  is  due  to  the 
efficiency  of  its  calculation  and  its  ease  of  implementation.  The  least  squares  solution,  however,  is  the 
optimal  solution  only  if  the  errors  are  normal.  The  assumption  of  normality  is  often  not  even  made 
explicitly  or  tested,  although  in  reality,  data  are  commonly  drawn  from  heavy-tailed  distributions  and 
have significantly more outliers than predicted by the normal distribution (e.g. Huber, 1972). This can
bias the least squares solution significantly through the square in the error term. The term outliers
refers to measurements that appear to fall well outside the distribution of the great majority of the data,
whatever that distribution is. By heavy-tailed, we refer to any distribution with heavier tails than the
normal distribution. Both terms will be useful in the discussion that follows, although we note that
whether a particular datum is an outlier or is drawn from a heavy-tailed distribution may be mostly a
matter of one’s outlook.

For perspective on the ubiquity of the assumption of normality, we note an interesting historical
point: the normal distribution was introduced by Gauss, in 1821, not because of its omnipresence in
nature, but because it is the distribution for which the easily calculated arithmetic mean is the best
estimate (Huber, 1972).

The  other  problem  we  address  is  the  censoring  of  data.  By  censoring,  we  mean  that  large  signals 
are  clipped  at  some  recording  instruments,  while  small  signals  remain  below  background  noise  levels 
at  some  other  instruments.  That  a  signal  is  above  or  below  some  threshold  is  useful  information  and  its 
exclusion  can  significantly  bias  parameter  estimates  (e.g.  Ringdal,  1977,  Blandford  and  Schumway, 
1982). 
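
A small numerical sketch can make this bias concrete. The example is illustrative only and is not taken from the SCSN data: it draws synthetic log amplitudes from a standard normal distribution, discards everything below a hypothetical noise threshold, and compares the mean of the surviving values with the mean of the full sample. The names `noise_floor` and `on_scale`, the threshold value, and the sample size are all assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic log amplitudes for one event at many stations (true mean is 0).
log_amp = rng.normal(loc=0.0, scale=1.0, size=10_000)

# Hypothetical noise floor: signals below it are never measured on scale.
noise_floor = -0.5
on_scale = log_amp[log_amp > noise_floor]

mean_all = log_amp.mean()          # the quantity we would like to estimate
mean_on_scale = on_scale.mean()    # the estimate if censored data are ignored
bias = mean_on_scale - mean_all    # positive: the estimate is pulled upward
```

Dropping the low-amplitude readings raises the apparent mean, which in a magnitude problem would translate into an overestimated event size. Ignoring clipped (upper threshold) readings pulls the estimate in the opposite direction.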

The  problem  of  site  amplification  and  magnitude  estimation  from  SCSN  data 

To  facilitate  the  presentation  of  the  techniques  that  deal  with  the  above  problems,  we  use  the 
calculation  of  site  amplifications  from  the  near-receiver  scattered  component  of  teleseismic  coda  for 
southern California seismic network (SCSN) stations. A thorough discussion of the SCSN site
amplifications can be found in Baker et al. (1996).

To  estimate  the  site  amplifications,  we  must  also  estimate  event  magnitudes.  The  problem  is  very 
straightforward: we measure rms amplitudes of near-receiver scattered teleseismic coda from many
events  at  all  stations  within  a  network.  We  write  each  measured  amplitude  as 

$$A_{ij} = E_j \cdot S_i \cdot \Gamma_{ij} \qquad (1)$$

where $S_i$ is the site amplification at station $i$, $E_j$ is the rms amplitude of the coda over some time
window for event $j$ (for unit site amplification), and $\Gamma_{ij}$ is a factor that accounts for random error from
all possible sources. We take the natural logarithm of (1), to obtain

$$a_{ij} = e_j + s_i + \gamma_{ij}. \qquad (2)$$

We use a multiplicative factor to describe the noise in (1), as the errors in the magnitudes and
logarithms of the site amplifications are assumed to be additive. If $\gamma_{ij}$ were assumed to be independent
zero mean Gaussian errors with standard deviations $\sigma_{ij}$, the least squares solution to the matrix
equation (3) would be the best solution we could find (e.g. Press et al., 1988). The assumption that
$\gamma_{ij}$ are normal is equivalent to taking $\Gamma_{ij}$ to be lognormal, a distribution commonly held to result from
the multiplicative combination of independent random variables (e.g. Priestley, 1981).

The last row of the matrix constrains the network to have no mean amplification and prevents
event and station parameters from trading off. The matrix would only reach its full height of $pq+1$ if
there were no missing data. Expressing (3) as $M e = a$, we can write the least squares solution in matrix
form as $e = (M^T M)^{-1} M^T a$ (e.g. Lawson and Hanson, 1974).
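
As a minimal sketch of this formulation (not the code used for the SCSN analysis), the NumPy fragment below builds a forward matrix for a small network. It assumes that each datum contributes one row with a 1 in the column of its event and a 1 in the column of its station, and that the final row sums the log site amplifications to zero. The function name `build_system`, the parameter ordering (events first, then stations), the unit weights, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np

def build_system(obs, n_events, n_stations):
    """Build M and a for a_ij = e_j + s_i plus a zero-mean constraint on s.

    obs is a list of (station i, event j, log amplitude a_ij). Missing data
    are simply absent from the list, so M can be shorter than its full height.
    """
    rows = len(obs) + 1
    M = np.zeros((rows, n_events + n_stations))
    a = np.zeros(rows)
    for k, (i, j, a_ij) in enumerate(obs):
        M[k, j] = 1.0               # event term e_j
        M[k, n_events + i] = 1.0    # site term s_i
        a[k] = a_ij
    M[-1, n_events:] = 1.0          # last row: sum of s_i equals 0
    return M, a

# Toy network: 2 events, 3 stations, one missing datum (station 2, event 1).
e_true = np.array([1.0, 2.0])
s_true = np.array([0.3, -0.1, -0.2])    # log amplifications, summing to zero
obs = [(i, j, e_true[j] + s_true[i])
       for i in range(3) for j in range(2) if (i, j) != (2, 1)]

M, a = build_system(obs, n_events=2, n_stations=3)
# lstsq is numerically preferable to forming (M^T M)^{-1} M^T a explicitly.
est, *_ = np.linalg.lstsq(M, a, rcond=None)
e_hat, s_hat = est[:2], est[2:]
```

Without the constraint row, a constant could be added to every event term and subtracted from every site term without changing the fit, which is the trade-off the last row removes.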

We have reason, however, to doubt that $\gamma_{ij}$ are independent zero mean Gaussian errors. The model
we  use  requires  a  large  number  of  assumptions  (e.g.  we  assume  that  near-receiver  scattering  is 
isotropic,  that  the  incoming  teleseismic  P-wave  has  constant  amplitude  over  southern  California,  and 
that  instrument  calibrations  do  not  drift  over  time).  It  is  easy  to  see  that  major  violations  of  any 
assumptions  could  lead  to  large  outliers.  Further,  SCSN  seismograms  are  often  mislabeled,  so  some 
measurements  may  be  attributed  to  the  wrong  stations,  causing  occasional  very  large  errors  (we 
estimate  that  as  much  as  1%  of  the  data  that  we  have  used  may  be  mislabeled  in  this  way).  A  statistical 
test  of  the  error  distribution,  presented  in  a  later  section,  confirms  this  skepticism.  In  the  following 
sections  we  will  address  the  consequences  of  using  the  least  squares  estimate  when  the  errors  are  not 
normal  and  will  use  the  site  amplification  problem  to  illustrate  a  method  of  estimation  that  is  robust  to 
such  difficulties  with  the  data. 

Size  is  one  further  important  factor  in  this  problem.  We  have  recordings  of  41  events  on  211 
stations,  for  a  total  of  4,397  on-scale  amplitude  measurements,  110  upper  threshold  measurements,  and 
703  lower  threshold  measurements.  Despite  the  large  number  of  measurements,  some  individual 
parameters  depend  on  few  data,  so  if  one  datum  is  in  some  way  bad,  we  need  a  way  of  identifying  it. 
Because  the  entire  set  of  data  in  this  case  is  so  large,  our  means  of  recognizing  dubious  data  must  be 
automatic. 

Additionally,  because  seismometers  have  limited  dynamic  range,  the  data  are  censored.  Ignoring 
the  censored  data  may  result  in  biased  site  amplification  estimates  and  event  magnitudes.  We  illustrate 
a  method  for  incorporating  censored  data  into  the  parameter  estimates  in  later  sections. 

Robust reweighting

The  first  set  of  problems  we  introduced,  heavy-tailed  distributions  and  egregious  blunders 
contaminating  the  data,  are  dealt  with  by  robust  statistics  techniques.  The  simplest  robust  statistics 
technique  is  probably  that  of  truncating  the  data,  that  is,  discarding  some  number  of  the  largest  and 
smallest  measurements,  and  computing  the  least  squares  solution  with  the  remaining  data.  The  simplest 
a  posteriori  technique  is  to  compute  a  least  squares  solution,  discard  any  data  considered  to  be  outliers, 
for  example,  anything  with  greater  than  3  standard  deviations  of  misfit,  and  then  re-compute  the  least 
squares  solution  with  the  remaining  data.  Any  such  technique  is  driven  by  the  consideration  that  some 
data  are  likely  to  be  bad.  This  may  also  be  viewed  as  insufficiency  of  the  model,  in  which  case 
unmodeled  parameters  are  mapped  into  the  modeled  parameters.  Either  way,  outlying  data  must  be 
identified  and  discarded  or  downweighted,  or  they  may  severely  bias  the  least  squares  solution, 
specifically  because  that  solution  seeks  to  minimize  the  squared  error. 

We  use  the  technique  of  robust  reweighting,  which  has  distinct  advantages  over  the  simpler  robust 
techniques  mentioned  above.  The  advantages  derive  from  its  ability  to  downweight  data,  rather  than 
simply  keep  or  discard  them.  By  just  discarding  outliers,  the  assumption  is  made  that  those  data  are 
bad,  that  is,  that  some  error  must  have  been  made  in  their  collection.  In  that  way,  potentially  useful 
data that are simply drawn from a heavy-tailed distribution are likely to be discarded. In contrast,
reducing the weights applied to outlying data, which can be viewed as increasing the estimate of $\sigma_{ij}$ in
(3), permits application of a milder penalty than would normally be applied to such data by the least
squares penalty function. Thus, such data can influence the solution without dominating it. Similarly, if
the data are drawn from a heavy-tailed distribution, the least squares solution is likely to be biased by
even moderately outlying data that are not discarded. By downweighting moderately outlying data, the
penalty applied to them can be reduced, for example to something like the penalty applied by the
$L_1$-norm estimate. The $L_1$-norm estimate is more appropriate for data from a heavy-tailed distribution, but
it is much more cumbersome to implement and slower to calculate than the least squares estimate,
which robust reweighting lets us continue to use. Versions of robust reweighting of data have been
applied in geophysics for some time (e.g. Jeffreys, 1932), although the practice is not as widespread as
may be warranted.

The  identification  and  downweighting  of  outliers  is  an  iterative  procedure.  A  set  of  weights  is 
determined  by  a  reweighting  function  that  is  dependent  on  the  misfit.  The  misfit  is 

$$z_n = W_n M e_n - W_n a, \qquad (4)$$

where $M$ is the forward model matrix in any linear inverse or parameter estimation problem written as
$M e = a$, such as (3), $a$ is the data vector, $e_n$ is the estimate of the vector of parameters at the $n$th
iteration, and $W_n$ is the diagonal weight matrix that was applied at the $n$th iteration. A new set of
weights is calculated based on the misfit, and the least squares solution of the newly reweighted
problem is found. The process continues until some criterion is met.
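
The iteration can be sketched in a few lines of NumPy. This is a generic iteratively reweighted least squares loop under stated assumptions, not the thesis implementation: the function names `irls` and `stand_in_weights`, the stopping rule (change in the parameter vector below `tol`), and the simple placeholder weight function are all hypothetical. The sketch also follows the common convention of recomputing weights from the unweighted misfit and scaling rows by the square root of the weight, so that squared misfits are multiplied by the weight; this passage does not state which convention is used for the diagonal of $W$.

```python
import numpy as np

def irls(M, a, weight_fn, n_iter=50, tol=1e-10):
    """Iteratively reweighted least squares for the linear problem M e = a."""
    w = np.ones(len(a))                    # first pass is ordinary least squares
    e = np.zeros(M.shape[1])
    for _ in range(n_iter):
        root_w = np.sqrt(w)                # row scaling: squared misfits get weight w
        e_new, *_ = np.linalg.lstsq(root_w[:, None] * M, root_w * a, rcond=None)
        converged = np.max(np.abs(e_new - e)) < tol
        e = e_new
        if converged:                      # assumed stopping criterion
            break
        w = weight_fn(M @ e - a)           # new weights from the current misfit
    return e, w

def stand_in_weights(r, k=1.0):
    """Placeholder weight function: 1 for |r| <= k, k/|r| beyond that."""
    return k / np.maximum(np.abs(r), k)

# Straight line y = 1 + 2x with one gross blunder in the last datum.
x = np.arange(10.0)
M = np.column_stack([np.ones_like(x), x])
a = 1.0 + 2.0 * x
a[-1] += 50.0

e_ls, *_ = np.linalg.lstsq(M, a, rcond=None)       # pulled toward the blunder
e_robust, w_final = irls(M, a, stand_in_weights)   # blunder is downweighted
```

In this toy problem the ordinary least squares slope is badly biased by the single blunder, while the reweighted slope stays much closer to the true value of 2 because the bad datum ends up with a small weight.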

The values of the robust weights are calculated from a reweighting function. Influence functions,
which are the reweighting functions multiplied by the misfit, are more commonly discussed in the
statistics literature. The integrals of the influence functions are the penalty functions associated with the
reweighted least squares inversions. A common robust reweighting penalty function and the least
squares and $L_1$-norm penalty functions are shown in figure 1, along with their associated influence and
reweighting functions (note: the $L_1$-norm penalty function is shown only for comparison with the
others, but has no true influence function as the weights approach infinity near zero misfit). In the
upper left panel we can see the overwhelming penalty applied by the least squares criterion to data that
cause moderate to large misfits. It is through that large penalty that outliers can significantly bias the
least squares estimate. The $L_1$-norm penalty is more appropriate for data from a heavy-tailed
distribution (the $L_1$-norm solution is the maximum likelihood estimator for double-exponentially
distributed errors), although it is still very large for egregious errors. An influence function such as the
Hampel 17a function (e.g. Montgomery and Peck, 1982), in which the influence of large outliers
decreases until they are eventually completely ignored, is more appropriate for data sets where
occasional egregious errors are expected. Such influence functions, that is, those with a negative slope,
are called redescenders. Convergence is guaranteed only when the influence function is convex, which
redescenders are not. In practice, however, convergence is usually rapid and convergence problems are
rare (e.g. Montgomery and Peck, 1982).

Figure 5.1: Penalty functions (upper left) for misfit of least squares, $L_1$-norm, and Hampel 17a
solutions, with the corresponding influence functions (upper right) and the values
for robust weights (lower left). Note that the $L_1$-norm does not have a true influence function,
as its robust weights would approach infinity near zero misfit. It is included here for comparison
with the other solutions.

The Hampel 17a influence function is linear, like the least squares solution, for data with little
misfit, constant, like the $L_1$-norm solution, for data with moderate misfit, and reduces the influence of
data, eventually to zero, outside of that range. The robust weights for the Hampel 17a function are
determined as follows:

$$
w(z) = \begin{cases}
1, & \text{for } |z| \le a \\
\dfrac{a}{|z|}, & \text{for } a < |z| \le b \\
\dfrac{a\,(c - |z|)}{(c - b)\,|z|}, & \text{for } b < |z| \le c \\
0, & \text{for } |z| > c
\end{cases}
\qquad (6)
$$


where $z$ is the misfit from (4), normalized by some robust estimate of the standard deviation, such as
the scaled median absolute deviation,

$$s = \frac{\mathrm{median}\left(|z - \mathrm{median}(z)|\right)}{z_m} \qquad (7)$$

(e.g. Montgomery and Peck, 1982). $z_m = 0.6745$ is obtained from

$$\int_0^{z_m} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = 0.25, \qquad (8)$$

so that when $z$ is standard normal (i.e. $\mu = 0$ and $\sigma = 1.0$), half of the points will be between $-z_m$ and $+z_m$,
and $s$ will equal 1.0.
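
The weight function and the robust scale estimate are simple to write down in code. The sketch below is one possible rendering, with the weights taken as the influence function divided by the misfit; the function names `hampel_weights` and `scaled_mad` are hypothetical, the default values of `a`, `b`, and `c` are the Montgomery and Peck (1982) values adopted in this chapter, and the constant `Z_M` is computed from the standard normal quantile function rather than typed in, which serves as a check on the value 0.6745.

```python
from statistics import NormalDist

import numpy as np

# z_m is the 75th percentile of the standard normal: a quarter of the
# probability lies between 0 and z_m, as required by equation (8).
Z_M = NormalDist().inv_cdf(0.75)

def scaled_mad(z):
    """Scaled median absolute deviation of the misfits, as in equation (7)."""
    z = np.asarray(z, dtype=float)
    return np.median(np.abs(z - np.median(z))) / Z_M

def hampel_weights(z, a=1.7, b=3.4, c=8.5):
    """Three-part redescending weights for misfits z already divided by s."""
    az = np.abs(np.asarray(z, dtype=float))
    w = np.ones_like(az)                      # |z| <= a: least squares weight
    mid = (az > a) & (az <= b)
    w[mid] = a / az[mid]                      # constant influence
    tail = (az > b) & (az <= c)
    w[tail] = a * (c - az[tail]) / ((c - b) * az[tail])   # redescending part
    w[az > c] = 0.0                           # egregious outliers are ignored
    return w

misfit = np.array([0.2, -0.4, 0.1, 0.3, -0.2, 6.0])
weights = hampel_weights(misfit / scaled_mad(misfit))
```

The weights are continuous at $|z| = a$, $b$, and $c$, so whether the interval boundaries are treated as open or closed has no effect on the result.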

Because  reweighting  functions  downweight  data  on  the  tails  of  the  distribution,  they  effectively 
“reshape”  the  distribution  to  appear  more  Gaussian.  That  is  not  to  suggest  that  it  is  important  to  perturb 
reweighting  function  parameters  (a,  b,  and  c  of  (6)  in  the  case  of  the  Hampel  17a  function)  to  exactly 
reshape  the  errors  to  be  Gaussian.  There  is  nothing  that  suggests  that  would  be  meaningful.  What  is 
important  is  that  the  final  parameter  estimates  be  robust  with  respect  to  the  choice  of  both  influence 
function  and  influence  function  parameters.  We  used  a=1.7,  b=3.4,  and  c=8.5  for  our  final  estimates,  as 
suggested  by  Montgomery  and  Peck  (1982).  Those  estimates  were  robust  to  large  variations  in 
influence function parameter values. For example, we varied a from 1.1 to 2.3 and used other
influence functions, without significantly affecting the values of the magnitude and site amplification
estimates.

We  have  described  the  robust  reweighting  as  a  way  of  improving  parameter  estimates  by  reducing 
the  undue  influence  outliers  have  on  a  least  squares  solution.  It  could  also  be  viewed,  more  basically, 
as  a  way  of  determining  the  most  accurate  estimates  of  the  data’s  standard  deviations.  We  attempt  to 
achieve  that  by  updating  the  standard  deviation  estimates  for  each  datum  based  on  how  closely  that 
datum  corresponds  to  other  data  for  the  same  event  or  station. 

In  addition  to  better  parameter  estimates,  another  distinct  advantage  of  robust  reweighting  is  that 
the  covariance  matrix  becomes  a  much  more  accurate  estimate  of  the  uncertainties  in  the  parameters. 
That is so, because the diagonal elements of the covariance matrix, $C = [(WM)^T (WM)]^{-1}$ (where $W$ is
the final weighting matrix and $M$ is the forward modeling matrix as above), are estimates of the
variance when scaled by the total misfit

$$\sigma^2 = \frac{\lVert W M e - W a \rVert^2}{u - v} \qquad (9)$$


(e.g.  Lawson  and  Hanson,  1974).  The  accuracy  of  those  estimates,  however,  depends  on  the  weights 
being  proportional  to  the  inverses  of  the  standard  deviations. 
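
A short sketch of this calculation follows. It is an illustration under stated assumptions rather than the thesis code: the function name `parameter_covariance` is hypothetical, `w` holds the diagonal of the final weighting matrix, and $u$ and $v$ are taken to be the number of data and the number of parameters, which is an assumption since the symbols are not defined in this passage.

```python
import numpy as np

def parameter_covariance(M, a, e, w):
    """Return sigma^2 * [(WM)^T (WM)]^{-1} for the weighted problem."""
    WM = w[:, None] * M                    # apply the diagonal weights row by row
    misfit = WM @ e - w * a                # weighted misfit
    dof = len(a) - M.shape[1]              # assumed meaning of u - v
    sigma2 = misfit @ misfit / dof         # total misfit per degree of freedom
    return sigma2 * np.linalg.inv(WM.T @ WM)

# Sanity check on the simplest case: estimating a mean with unit weights.
M = np.ones((5, 1))
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
e = np.array([a.mean()])
cov = parameter_covariance(M, a, e, np.ones(5))
```

For this unit-weight case the single diagonal element reduces to the familiar variance of a sample mean, the sample variance divided by the number of data.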


Trade-offs  between  the  standard  least  squares  and  the  robust  solutions 

We have suggested that if errors are not normally distributed or the data include some large outliers,
the  least  squares  solution  could  provide  a  poor  estimate  of  some  parameter  values.  If  on  the  other  hand, 
the  errors  are  normally  distributed,  the  least  squares  solution  is  the  maximum  likelihood  solution.  Any 
other  solution,  including  the  robust  reweighted  solution,  would  be  worse,  so  there  is  a  price  to  be  paid 
for  using  robust  reweighting.  We  illustrate  with  a  simple  synthetic  example  the  relative  extent  of  error 
we  risk  from  using  each  approach. 

We constructed an exact synthetic data set assuming 10 stations with various site amplifications
recording  20  events  of  various  magnitude.  We  then  added  normally  distributed  random  noise  scaled  to 
have  a  standard  deviation  equal  to  10%  of  the  expected  log  amplitude  for  each  measurement,  and 
found  the  least  squares  and  robust  estimates.  We  repeated  the  experiment  100  times  to  determine 
accurately the typical error of each type of solution (Table 1). The average error of the robust estimates
of magnitude is only 3% worse than that of the least squares estimates (2.65% vs. 2.57%), and the
site amplification estimates are only 3.5% worse (9.96% vs. 9.62%). However, when the added noise
is taken from the heavier-tailed exponential distribution, the least squares magnitude estimates are 12%
worse than the robust estimates (3.60% vs. 3.22%) and the site amplification estimates are 17% worse
(13.15% vs. 11.23%). The difference is even greater if there are some large outliers in the data. For a
data set as above, but with a couple of measurements significantly off (one measurement is ~2.5 times
too large, and one is ~2 times too small), the least squares estimates are much worse than the robust
reweighted  estimates.  Those  2  bad  measurements  affect  the  estimates  of  4  parameters,  the  2  events 
involved  and  the  2  stations.  Over  100  runs  (adding  different  random  exponential  noise,  but  the  same  2 
blunders),  the  mean  least  squares  estimates  for  those  parameters  were  significantly  worse  than  the 
robust estimates (Table 1). The difference is due to the robust weights assigned to the data constraining those parameters.
In  the  least  squares  solution,  all  measurements  are  assigned  equal  weight.  In  the  robust  estimates,  the  2 
dubious  data  are  recognized  and  assigned  weights  of  about  0.36  (the  mean  weight  for  all  other 
measurements  in  the  final  robust  iteration  was  0.965). 


Table 5.1: Differences in mean errors between the least squares (second and third columns)
and robust (rightmost two columns) solutions to synthetic data with normal (top row) and
exponential (second row) noise, and large outliers (bottom rows).

Type of noise added        Mean least squares error           Mean robust solution error
                           Magnitudes       Amplifications    Magnitudes       Amplifications
Gaussian                   2.57%            9.62%             2.65%            9.96%
Exponential                3.60%            13.15%            3.22%            11.23%

                           Magnitudes       Amplifications    Magnitudes       Amplifications
                           L.S. (true)      L.S. (true)       robust (true)    robust (true)
Exponential + 2 blunders   6.74 (6.5)       0.342 (0.300)     6.55 (6.5)       0.309 (0.300)
                           6.33 (6.5)       0.341 (0.375)     6.52 (6.5)       0.380 (0.375)


The  effect  of  robust  reweighting  on  real  data 

In  considering  whether  to  use  a  particular  technique,  one  wishes  to  know  whether  the 
improvement  in  resolution  will  be  sufficient  to  justify  the  time  and  effort  required  to  implement  it.  This 
section  demonstrates  the  considerable  advantages  of  robust  reweighting  over  standard  least  squares  for 
a  real  geophysical  problem.  As  one  would  expect  from  the  earlier  technique  description,  the  additional 
programming  required  to  implement  robust  reweighting,  once  the  least  squares  estimate  has  been 
accomplished,  is  almost  negligible. 

For  the  SCSN  problem,  we  do  not  have  to  simply  guess  what  the  error  distribution  is.  Once  we 
have a solution, we can test whether the errors fit the assumed distribution. Although the L1-norm
solution  is  generally  preferred  as  an  initial  solution  for  robust  techniques  as  it  is  not  nearly  as  sensitive 
to  outliers  as  the  least  squares  solution,  it  is  unwieldy  to  calculate  for  such  a  large  system.  Hence  we 
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begin  with  a  standard  least  squares  solution  weighted  only  by  signal-to-pre-event-noise  ratios  (the 
validity  of  that  a  priori  estimate  of  the  standard  deviation  is  discussed  in  Appendix  1).  We  perform  a 
Kolmogorov-Smirnov test on the misfits, and find that the distribution is normal with probability zero.
We  can  see  that  the  actual  error  distribution  has  much  heavier  tails  than  the  normal  distribution,  and 
has  some  very  large  errors  (Figure  2).  The  heavier-than-Gaussian  tails  and  very  large  errors  indicate 
that robust reweighting is warranted. The robust weights are determined based on the Hampel 17A
influence  function  with  equation  (6)  parameters  set  to  values  of  a=1.7,  b=3.5,  and  c=8.  Although  these 
are  typical  suggested  values  (e.g.  Montgomery  and  Peck,  1982),  we  re-iterate  that  it  is  most  important 
that  the  final  parameter  estimates  be  robust  to  a  wide  range  of  choices  of  a,  b,  and  c. 
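
To make the reweighting step concrete, here is a minimal sketch of iterative reweighting with a three-part redescending Hampel weight, using the a, b, and c values quoted above. Several things here are assumptions rather than the procedure of this study: the weight is taken as the standard Hampel influence function divided by the residual (equation (6) may be parametrized differently), the scale is re-estimated from the median absolute residual, and the problem is reduced to a single location parameter instead of the full set of magnitudes and site amplifications. The function names and the sample values are hypothetical.

```python
import statistics


def hampel_weight(r, a=1.7, b=3.5, c=8.0):
    """Weight psi(r)/r for the three-part redescending Hampel function."""
    x = abs(r)
    if x <= a:
        return 1.0                          # treated as ordinary least squares
    if x <= b:
        return a / x                        # influence held constant
    if x <= c:
        return a * (c - x) / ((c - b) * x)  # influence tapers to zero
    return 0.0                              # rejected outright


def robust_location(data, n_iter=20, tol=1e-9):
    """Iteratively reweighted estimate of a single location parameter."""
    estimate = sum(data) / len(data)        # start from the least squares mean
    weights = [1.0] * len(data)
    for _ in range(n_iter):
        residuals = [d - estimate for d in data]
        # Robust scale: median absolute residual, scaled to a normal sigma.
        scale = 1.4826 * statistics.median(abs(r) for r in residuals)
        if scale == 0.0:
            break
        weights = [hampel_weight(r / scale) for r in residuals]
        total = sum(weights)
        if total == 0.0:
            break
        new_estimate = sum(w * d for w, d in zip(weights, data)) / total
        converged = abs(new_estimate - estimate) < tol
        estimate = new_estimate
        if converged:
            break
    return estimate, weights


# Six consistent values near 6.5 and one blunder.
sample = [6.4, 6.5, 6.6, 6.45, 6.55, 6.5, 9.0]
ls_mean = sum(sample) / len(sample)
robust_mean, final_weights = robust_location(sample)
```

In this toy case the least squares mean is pulled toward the blunder, while the reweighted estimate settles on the value supported by the six consistent data and the blunder ends with zero weight. Repeating the run with other choices of a, b, and c is a simple way to check the insensitivity to those parameters that the text calls for.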

Even limiting consideration to those stations recording at least 10 events on-scale, we found dramatic
changes in site amplifications due to robust reweighting. Of the 189 sites recording 10 or more
events, 28 had changes of 10% or greater, and 13 changed by more than 20%. Examining the data that were
downweighted  at  sites  with  large  estimates  of  variance  reveals  the  power  of  the  robust  reweighting  to 
identify  dubious  data  that  would  otherwise  almost  certainly  go  unnoticed. 

We  begin  with  a  simple  example,  station  SBK,  which  had  the  4th  largest  variance  and  a  change  in 
site  amplification  from  0.51  to  0.83.  Figure  3  (top  left  panel)  shows  that  7  recordings  out  of  29  were 
given  nearly  zero  weight,  and  all  7  are  from  the  same  continuous  time  interval  (the  beginning  of  1992 
through  to  the  beginning  of  1994).  Clearly,  something  was  different  about  the  station  during  that 
interval.  We  suppose  that  an  indication  of  the  station’s  magnification  can  be  obtained  by  background 
noise  levels,  and  so  compare  the  pre-event  noise  vs.  the  date  (second  row,  left  column)  and  the  robust 
weights  vs.  pre-event  noise  (bottom  row,  left  column).  Pre-event  noise  levels  much  lower  than  were 
typical  for  SBK  were  recorded  for  the  downweighted  events.  We  conclude  that  the  instrument’s 
magnification  was  reduced  during  this  time  period,  without  the  change  in  instrument  parameters  being 
recorded.  In  this  case  robust  re-weighting  has  greatly  improved  a  parameter  estimate  by  removing 
dubious data. For this station, the same end might have been achieved via a careful, and lengthy,
examination of pre-event noise levels vs. time for all stations (although it is only through the robust
reweighting that it occurred to us to perform such a test), or by simply throwing out very large outliers.
The next example is not as straightforward, and could not have been resolved by other such means.

Figure 5.2: Quantile-quantile plots for the misfits from the least squares (top) and robustly reweighted
(bottom) solutions. If the misfits were normally distributed, the points would lie along the diagonal.
The large deviation from the diagonal beginning at approximately 2 standard deviations, in the least
squares case, indicates that the distribution of the error is much heavier-tailed than normal. The very
large outliers most likely indicate outright blunders in the data.

Figure 5.3: Plots of robust weights vs. event dates (top row) for two of the stations most affected by the
robust reweighting, SBK (left column) and LUC (right column), showing a distinct grouping of low weights
within specific time periods. Pre-event noise levels vs. event dates (middle row) indicate that
magnification was likely significantly lower than was recorded in instrument parameter logs for station
SBK during that time period. The robust weights are plotted vs. the pre-event noise levels in the bottom
row. The correlation is perfect for SBK, but more muddled for LUC.

The  site  amplification  at  LUC  had  by  far  the  largest  variance  and  the  greatest  change,  from  0.51  to 
1.02. 13 out of the 26 recordings were given robust weights of 0.3 or less, and 11 of them are from the
same  continuous  time  period  (figure  3,  upper  right).  As  with  SBK,  it  appears  likely  that  something 
about  the  instrument  response  was  different  during  that  time  period,  but  just  what  is  less  clear.  The  pre- 
event  noise  level  was  higher  for  the  4  events  in  1990,  but  from  then  on  the  level  remained  within  a 
roughly  constant  range.  Something  else,  perhaps  a  change  in  the  seismometer’s  response  to  velocities 
much  higher  than  background  noise  levels,  appears  to  have  changed.  Although  we  cannot  identify  the 
exact  cause,  the  suspicious  grouping  of  low  weights  with  time  indicates  a  problem  with  the  instrument 
response  that  the  robust  reweighting  has  effectively  alerted  us  to.  Similar  problems  with  incorrect 
instrument  response  information  appear  to  be  at  the  root  of  the  changes  in  most  other  stations  with  high 
variances.  In  no  case  did  low  robust  weights  clearly  correlate  with  event  azimuth,  depth,  or  size.  For  a 
small  number  of  stations  with  high  variances,  a  clear  cause  is  not  evident.  It  may  be  that  multiple 
causes  obscure  the  explanation  (e.g.  incorrect  instrument  parameters  for  multiple  time  periods, 
mislabeling  of  some  seismograms  resulting  in  their  attribution  to  the  wrong  stations).  We  cannot  be 
certain  in  such  cases,  as  we  are  for  most,  that  the  parameter  estimate  has  been  improved  by  the  robust 
reweighting, but the technique has clearly alerted us that those site amplifications are poorly
constrained. 

The  magnitude  estimates  were  not  as  strongly  affected,  with  the  greatest  change  between  least 
squares  and  robust  solutions  being  2.7%,  or  0.2  magnitude  units.  Upon  re-examining  the  records  for 
that  event,  we  see  that  5  of  the  107  recordings  were  given  robust  weights  of  zero,  and  all  5  were  at 
stations  recognized  as  having  incorrect  instrument  response  parameters  on  the  event  date.  All  other 
recordings  of  that  event  had  robust  weights  of  1.0.  The  small  changes  in  other  event  magnitudes  also 
appear  attributable  to  errors  in  instrument  responses. 

As mentioned in the description of robust reweighting, for such a solution to be meaningful, it
must  be  consistent  in  the  face  of  changes  to  the  influence  function  parameters  (i.e.  a,  b,  and  c  of  (6)) 
and  to  the  use  of  alternate  influence  functions.  The  SCSN  robust  solutions  remained  consistent 
throughout  both  types  of  changes. 

To  summarize,  scrutiny  of  the  data  behind  the  parameters  most  affected  by  the  robust  reweighting 
indicates  that  the  technique  has  correctly  singled  out  and  downweighted  problematic  data  and  so 
improved  parameter  estimates,  and  has  alerted  us  to  poorly  constrained  parameter  values.  We  conclude 
that,  just  as  in  the  synthetic  experiments,  robust  reweighting  has  improved  the  estimates  of  site 
amplifications  for  the  SCSN. 

Incorporating  censored  data  into  parameter  estimates:  Previous  work  in  magnitude  estimation 

Ringdal  (1977)  incorporated  low  signal  level  information  into  magnitude  estimation  using 
maximum  likelihood  estimation.  Estimating  only  magnitude  and  variance,  he  was  able  to  simply 
examine  a  range  of  parameter  values  to  maximize  the  likelihood  function.  He  found  that  magnitude 
biases  of  0.5  magnitude  units  due  to  censoring  were  probable  for  teleseismic  events  recorded  on  a 
small  network,  and  were  correctable  by  incorporation  of  the  censored  data.  He  anticipated  further 
improvement  for  larger  networks.  Blandford  and  Shumway  (1982)  extended  that  work  to  include 
clipping  information  and  the  estimation  of  individual  station  biases  and  distance  corrections.  They 
maximized  the  likelihood  function  using  the  expectation  maximization  approach.  Our  approach 
expands  on  both  of  these  studies.  We  also  consider  doubly  censored  data,  although  we  use  an  inverse 
approach  to  solve  iteratively  a  linearized  version  of  the  set  of  equations  obtained  by  setting  the 
derivatives  of  the  likelihood  function  equal  to  zero. 

Derivation  of  the  Likelihood  Function 

The  problem  is  to  determine  the  set  of  site  amplifications  and  event  magnitudes  that  is  most  likely, 
given  the  set  of  amplitudes  we  have  measured.  To  do  so,  we  assume  that  the  noise  in  the  data  is 
normally distributed so that we can determine functions that describe the probability of having
obtained  each  type  of  measurement  (upper  thresholds  for  clipped  data,  amplitude  measurements  for  on- 
scale  data,  and  lower  thresholds  for  below  noise  level  data)  in  terms  of  the  parameters  we  wish  to 
estimate,  namely  site  amplifications  and  event  magnitudes.  As  the  inaccuracy  of  the  assumption  of 
normality  has  been  amply  demonstrated,  we  use  robust  reweighting  to  solve  the  censored  problem,  but 
for  clarity  we  do  not  discuss  it  further  in  this  section.  The  likelihood  function  is  the  product  of  the 
probabilities  for  each  individual  measurement  and  so  represents  the  probability  of  having  obtained  a 
particular  entire  set  of  measurements,  in  terms  of  a  set  of  parameter  values  (e.g.  Montgomery  and  Peck, 
1982).  The  most  likely  set  of  parameter  values  is  that  set  which  maximizes  the  probability  of  having 
obtained  the  actual  measurements,  and  so  is  found  by  maximizing  the  likelihood  function. 

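As each $a_{i,j}$ is a random variable drawn from a normal distribution with mean $(e_j + s_i)$ and standard
deviation $\sigma_{i,j}$, according to the assumptions made in writing (2), the probability of obtaining a
particular value $a_{i,j}$, that is, the probability density function of $a_{i,j}$, is

$$P(a_{i,j}) = \frac{1}{\sigma_{i,j}}\,\phi\!\left(\frac{a_{i,j} - e_j - s_i}{\sigma_{i,j}}\right) \qquad (10)$$

where $\phi$ is the standard normal probability density function. For noisy records, the probability of the
amplitude being below the measured noise level is the integral of the distribution (10) over the range of
values below the noise level, that is,

$$P(a_{i,j} \le {}_1t_{i,j}) = \int_{-\infty}^{{}_1t_{i,j}} \frac{1}{\sigma_{i,j}}\,\phi\!\left(\frac{a_{i,j} - e_j - s_i}{\sigma_{i,j}}\right) da_{i,j} = \Phi\!\left(\frac{{}_1t_{i,j} - e_j - s_i}{\sigma_{i,j}}\right) \qquad (11)$$

where ${}_1t_{i,j}$ is the estimated amplitude threshold, obtained by measuring the noise level over the
appropriate time window, and $\Phi(x)$ is the cumulative distribution function of a standardized normal
random variable, i.e. $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt$. Similarly, for clipped records, we estimate
${}_2t_{i,j}$, a maximum threshold. Then, the equation

$$P(a_{i,j} \ge {}_2t_{i,j}) = \Phi\!\left(\frac{-{}_2t_{i,j} + e_j + s_i}{\sigma_{i,j}}\right) = 1 - \int_{-\infty}^{{}_2t_{i,j}} \frac{1}{\sigma_{i,j}}\,\phi\!\left(\frac{a_{i,j} - e_j - s_i}{\sigma_{i,j}}\right) da_{i,j} \qquad (12)$$

describes the probability that the amplitude is above the threshold.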
The likelihood function is

$$L = \prod_{(i,j) \in D} P(a_{i,j}) \prod_{(i,j) \in N} P(a_{i,j} \le {}_1t_{i,j}) \prod_{(i,j) \in C} P(a_{i,j} \ge {}_2t_{i,j}) \qquad (13)$$

where D is the set of measured amplitudes, N is the set of observations in which the signal was below
the noise level, and C is the set of clipped records.
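
The structure of the likelihood can be illustrated with a short sketch that evaluates its logarithm for given parameter values. It assumes the normal model described above, with on-scale data contributing a density term, below-noise data contributing the probability of lying under the lower threshold, and clipped data contributing the probability of lying above the upper threshold. The record layout, dictionary keys, and function names are hypothetical choices made for the example.

```python
import math


def phi(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def log_likelihood(e, s, on_scale, below_noise, clipped):
    """Log-likelihood for doubly censored log amplitudes.

    e: dict mapping event -> magnitude term
    s: dict mapping station -> site term
    Each record is (station, event, value, sigma); `value` is the measured
    log amplitude for on-scale data and the threshold for censored data.
    """
    total = 0.0
    for i, j, a, sig in on_scale:        # density of the measured value
        total += math.log(phi((a - e[j] - s[i]) / sig) / sig)
    for i, j, t1, sig in below_noise:    # P(amplitude <= lower threshold)
        total += math.log(Phi((t1 - e[j] - s[i]) / sig))
    for i, j, t2, sig in clipped:        # P(amplitude >= upper threshold)
        total += math.log(Phi(-(t2 - e[j] - s[i]) / sig))
    return total


# One record of each type, with thresholds placed exactly at the predicted mean.
example = log_likelihood(
    {"ev": 6.0}, {"sta": 0.5},
    on_scale=[("sta", "ev", 6.5, 1.0)],
    below_noise=[("sta", "ev", 6.5, 1.0)],
    clipped=[("sta", "ev", 6.5, 1.0)],
)
```

Working with the logarithm turns the products into sums, which is the form differentiated in the next section.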


Maximizing  the  likelihood  function  by  iteratively  solving  a  linearized  equation 

We substitute (10), (11), and (12) into (13), separate the products into sums by taking the natural
logarithm of the likelihood function, and then maximize the likelihood function by setting the
derivatives with respect to $e_j$ and $s_i$ equal to 0. The derivatives with respect to $e_j$ are
where the summations are over all stations that recorded the event j on-scale (first term on the right
side),  stations  where  the  signal  was  below  the  noise  level  (second  term),  and  stations  that  clipped  (third 
term).  The  equations  for  derivatives  with  respect  to  the  station  amplification  parameters  are  similar  (the 
terms  are  identical,  but  the  summations  are  over  all  events  recorded  at  the  station).  In  the  second  and 
third  terms,  the  data  are  nonlinear  functions  of  the  parameters.  To  maximize  the  likelihood  function, 
we  linearize  (14)  and  use  a  starting  model  based  on  just  the  on-scale  recordings  (robustly  reweighted) 
to  find  iteratively  the  parameter  values  where  (14)  is  closest  to  zero. 

For convenience, we call the derivative of the log of the likelihood function, (14), Z(e), where e is
the set of all parameters, $(e_1, e_2, \ldots, e_p, s_1, s_2, \ldots, s_q)$. The Taylor series expansion of one element of
Z(e), for example the derivative with respect to $e_j$ in (14), is

$$Z_j(\mathbf{e}) = Z_j(\mathbf{e}_0) + \sum_k \left.\frac{\partial Z_j}{\partial e_k}\right|_{\mathbf{e}_0} (e_k - e_{0,k}) \qquad (15)$$

The summation is over all parameters e. The only non-zero derivatives of $Z_j(\mathbf{e})$ are those with respect
to site amplifications at the stations that recorded the event and with respect to $e_j$.

(15) equals zero when L is maximized, so

$$\sum_k \left.\frac{\partial Z_j}{\partial e_k}\right|_{\mathbf{e}_0} (e_k - e_{0,k}) = -Z_j(\mathbf{e}_0) \qquad (16)$$

is the equation we need to satisfy.

The derivative of $Z_j$, that is, the second derivative of the log of the likelihood function, with respect
to the event $e_j$, is

where

$$G(x) = \frac{\phi(x)}{\Phi(x)}, \quad \text{and} \quad H(x) = \frac{\phi'(x)}{\Phi(x)}. \qquad (18)$$


The  summations  in  (17)  are  over  all  stations  recording  the  event,  on-  and  off-scale. 
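
For reference, the first and second derivatives of the log-likelihood with respect to a single event term can be sketched directly from (10) through (13). This is a derivation under stated assumptions, not a transcription of (14) and (17): it writes the clipped-data probability as $\Phi$ of the negated standardized threshold, uses only $G(x) = \phi(x)/\Phi(x)$ together with the identity $\phi'(x) = -x\,\phi(x)$, and introduces the shorthand $z$, ${}_1z$, and ${}_2z$ for the standardized residual and thresholds. The grouping of terms may differ from the original equations.

```latex
\begin{aligned}
z_{i,j} &= \frac{a_{i,j} - e_j - s_i}{\sigma_{i,j}}, \qquad
{}_1z_{i,j} = \frac{{}_1t_{i,j} - e_j - s_i}{\sigma_{i,j}}, \qquad
{}_2z_{i,j} = \frac{{}_2t_{i,j} - e_j - s_i}{\sigma_{i,j}} \\[4pt]
Z_j = \frac{\partial \ln L}{\partial e_j}
  &= \sum_{i \in D} \frac{z_{i,j}}{\sigma_{i,j}}
   - \sum_{i \in N} \frac{G({}_1z_{i,j})}{\sigma_{i,j}}
   + \sum_{i \in C} \frac{G(-{}_2z_{i,j})}{\sigma_{i,j}} \\[4pt]
\frac{\partial Z_j}{\partial e_j}
  &= -\sum_{i \in D} \frac{1}{\sigma_{i,j}^2}
   + \sum_{i \in N} \frac{G'({}_1z_{i,j})}{\sigma_{i,j}^2}
   + \sum_{i \in C} \frac{G'(-{}_2z_{i,j})}{\sigma_{i,j}^2},
  \qquad G'(x) = -G(x)\,\bigl[x + G(x)\bigr]
\end{aligned}
```

Because $x + G(x) > 0$ for all $x$, every term of the second derivative is negative, so the log-likelihood is concave in $e_j$ and the iteration has a single maximum to find.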


The  second  derivatives  with  respect  to  station  amplifications  are  again  similar,  with  the  difference 
being  that  the  summations  are  over  all  events  recorded  at  the  station.  The  derivatives  with  respect  to 
both an event and station parameter consist of just the single term in (17) containing both parameters
$e_y$ and $s_x$. Whether $a_{x,y}$ is an on-scale or threshold measurement only affects which summation of (17)
the term is drawn from. Successive estimates of $\mathbf{e}_0$ are calculated from (15), until they do not vary
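significantly from the previous estimate. (15), expressed in matrix form, is

$$\nabla\nabla^T \mathcal{L}(\mathbf{e}_0)\,(\mathbf{e} - \mathbf{e}_0) = -\nabla \mathcal{L}(\mathbf{e}_0) \qquad (19)$$

where $\mathcal{L}(\mathbf{e}_0)$ represents the log of the likelihood function, evaluated at $\mathbf{e}_0$.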

In  implementing  the  iterative  calculations  by  inversion  of  (19),  it  is  necessary  to  normalize  the 
standard  deviation  estimates  of  all  data.  For  the  least  squares  estimate  from  just  the  on-scale  data,  it 
was  sufficient  that  the  weights  be  proportional  to  the  inverse  standard  deviations,  but  the  arguments  of 
functions  in  (17)  depend  on  the  absolute  value  of  the  standard  deviations.  Hence,  all  standard  deviation 
estimates should be scaled, either by s of (4), or $\sigma$ of (9).
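
The iteration can be sketched for the simplest possible case, a single event magnitude with the site terms held fixed and already subtracted from the data. This is a simplification made for illustration: the full problem solves for all magnitudes and site amplifications together through the matrix equation, whereas here the linearized equation reduces to a scalar Newton step. The sketch assumes the normal model above with absolute (already scaled) standard deviations; the function names and the numbers are hypothetical.

```python
import math


def phi(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def G(x):
    # Ratio phi/Phi; Phi underflows for very negative x, which a production
    # code would need to guard against. Adequate for this sketch.
    return phi(x) / Phi(x)


def dG(x):
    g = G(x)
    return -g * (x + g)  # derivative of G, using phi'(x) = -x * phi(x)


def censored_magnitude(e0, on_scale, below_noise, clipped, n_iter=50, tol=1e-10):
    """Newton iteration for one event term.

    Each record is (value_minus_site_term, sigma), where the value is a
    measured log amplitude (on-scale) or a threshold (censored).
    """
    e = e0
    for _ in range(n_iter):
        Z = 0.0    # first derivative of the log-likelihood
        dZ = 0.0   # second derivative of the log-likelihood
        for a, sig in on_scale:
            Z += (a - e) / sig ** 2
            dZ -= 1.0 / sig ** 2
        for t1, sig in below_noise:
            x = (t1 - e) / sig
            Z -= G(x) / sig
            dZ += dG(x) / sig ** 2
        for t2, sig in clipped:
            x = -(t2 - e) / sig
            Z += G(x) / sig
            dZ += dG(x) / sig ** 2
        step = -Z / dZ   # solve the linearized equation Z + dZ * step = 0
        e += step
        if abs(step) < tol:
            break
    return e


on = [(6.4, 0.1), (6.6, 0.1)]
e_plain = censored_magnitude(6.0, on, [], [])
e_low = censored_magnitude(e_plain, on, [(6.45, 0.1)], [])
e_high = censored_magnitude(e_plain, on, [], [(6.55, 0.1)])
```

As in the text, the starting value comes from the on-scale data alone. A below-noise record whose threshold lies under that estimate pulls the magnitude down, and a clipped record whose threshold lies above it pulls the magnitude up. Since the arguments of G depend on the absolute size of sigma, rescaling all sigmas by a common factor changes the censored results, which is why the normalization described above matters.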

The  practical  effects  of  iterative  reweighting  to  incorporate  censored  data 

For  a  set  of  data  with  events  recorded  on  overlapping  sets  of  stations,  simultaneously  calculating 
site  amplifications  and  event  magnitudes  serves  the  same  purpose  as  iterative  reweighting  for 
censoring.  This  was  noted  by  Ringdal  (1977),  who  demonstrated  the  effect  of  censoring  on  small  data 
sets  where  the  type  of  measurement  precluded  estimating  station  corrections  (he  measured  teleseismic 
peak amplitudes, which are sensitive to backazimuth at the station and source radiation patterns). In
tests  for  the  case  of  single  censoring,  both  for  synthetic  data  from  a  10  station  network  and  100  events, 
and  with  data  from  13  WWSSN  stations  and  61  events,  Ringdal  showed  that  a  bias  of  0.5  Mb  units  was 
a  realistic  effect  of  estimating  magnitudes  of  teleseismic  events  from  censored  data  without  estimating 
station  corrections.  In  the  SCSN  problem  we  are  able  to  simultaneously  calculate  magnitudes  and  site 
amplifications,  as  rms  amplitude  measurements  of  near-source  scattered  coda  are  insensitive,  or  at  least 
much  less  sensitive,  to  event  azimuth.  This  does  not  mean  that  the  censored  data  are  entirely  redundant, 
as  the  data  are  both  limited  and  noisy. 

For  the  SCSN  study,  there  proved  to  be  sufficient  overlapping  data  that  the  iterations  for  censoring 
had  little  first  order  effect  on  most  parameter  values.  Changes  in  magnitude  estimates  were  on  the 
order  of  1%  or  less.  There  were  some  greater  changes  in  site  amplification  values  (table  2).  For 
application of site amplifications, we use only those sites with at least 10 on-scale recordings (Baker
et al., 1996), but to illustrate the importance of censored data on parameters dependent on fewer data,
we  present  all  site  amplifications  with  more  than  a  5%  change  in  their  values  due  to  the  incorporation 
of  censored  data  (only  one  had  more  than  10  on-scale  recordings).  Note  that  one  site  amplification  (the 
first listed) increased by 5.8%, despite having no censored data. The magnitude estimate of each event
that station recorded was reduced by the incorporation of censored data. This indicates the
surprising  importance  of  the  very  small  changes  in  magnitude  estimates  to  site  amplification  estimates 
when  few  data  are  available.  While  the  effect  of  censored  data  proved  to  be  small  for  this  problem,  we 
consider  the  changes  to  be  an  improvement  in  the  accuracy  of  the  site  amplifications  calculated.  This 
exercise  also  provides  some  insight  into  the  extent  of  the  effect  of  censoring  in  a  problem  with  many 
data  and  many  parameters.  Although  the  noise  level  of  the  data  and  appropriateness  of  the  model  are 
important  and  will  make  prediction  of  the  effect  of  censoring  difficult,  this  result  indicates  that  for  any 
problem  with  parameters  dependent  on  few  data,  especially  if  the  extent  of  overlap  of  parameter  values 
is  small,  ignoring  censoring  could  lead  to  significant  error. 


Table 5.2: Differences between site amplifications for the SCSN with and without
incorporating censored data.

Number of    Number of     Number of    Censored   Uncensored   Difference
on-scale     below-noise   clipped      estimate   estimate     between
recordings   recordings    recordings                           estimates
6            0             0            0.54       0.57         5.8%
16           2             0            1.13       1.05         7.2%
2            6             1            1.23       1.11         10.3%
3            2             0            0.98       0.87         11.3%
1            1             0            0.47       0.54         14.2%
4            4             0            0.61       0.72         16.8%
3            8             11           0.74       0.86         17.2%


Conclusions 

We  have  demonstrated  with  both  synthetic  and  real  data  examples  how  robust  statistics 
significantly  improve  the  accuracy  of  magnitude  and  site  amplification  estimation.  This  works  by 
reducing  the  influence  of  outliers  in  data  drawn  from  heavy  tailed  distributions,  and  so  should  be 
applicable  to  a  wide  variety  of  geophysical  problems.  The  robust  reweighting  not  only  automatically 
throws  out  very  large  outliers  from  data  sets  too  large  to  permit  more  than  spot-checking  of  outliers, 
but  downweights  moderate  outliers  so  that  the  least  squares  criterion  does  not  allow  a  single  datum  or  a 
few  data  to  bias  a  solution  significantly.  It  also  greatly  increases  the  accuracy  of  uncertainty  estimates 
for  a  least  squares  problem.  Its  efficiency  and  ease  of  implementation  make  it  an  attractive  and  sensible 
choice  for  improving  geophysical  parameter  estimates. 

We  also  find  that  parameter  estimates  are  improved  through  the  incorporation  of  censored  data  via 
maximum likelihood estimation. This was previously demonstrated, and our contributions are 1) to
derive and present all equations necessary to incorporate censored data into estimates of magnitude and
station amplifications, and 2) to confirm its usefulness on a very large data set, where overlapping
events and stations might have led us to suspect that the censored data would not be important. Again,
we  expect  this  technique  could  be  applied  with  profit  to  other  areas  of  geophysics  where  signals  are 
often  outside  the  range  of  the  recording  instruments. 

Appendix 5.1


Validity  of  a  priori  weighting  by  ratios  of  signal  to  pre-event  noise 


For the SCSN problem, we made an initial estimate of each $\sigma_{i,j}$ based on the signal to pre-event
noise  ratio.  Pre-event  noise  may  be  a  major  source  of  error  in  the  measurements  of  amplitudes,  but 
statistical  variability  is  likely  due  to  more  than  additive  background  noise.  For  example,  as  mentioned 
earlier,  scattering  into  the  coda  may  be  distinctly  azimuthally  dependent  at  some  stations  or  a  radiation 
pattern  nodal  line  for  some  event  might  cross  the  network  so  that  incoming  energy  varies  significantly. 
Since  we  don’t  know  a  priori  which  amplitude  measurements  will  be  affected,  the  best  we  can  do  for 
an  initial  estimate  of  the  standard  deviation  is  to  use  the  ratio  of  signal  amplitude  to  pre-event  noise, 
but  recognizing  that  there  will  be  other  contributions  to  the  overall  error  process,  we  should  place  a 
minimum on the permissible estimate of $\sigma_{i,j}$. We note that although seismic pre-event noise is additive,
we  expressed  the  uncertainty  in  the  amplitude  as  a  multiplicative  factor  in  (1).  This  expression,  which 
permits  separation  of  all  the  factors  in  (1)  by  taking  the  logarithm,  could  prove  difficult  to  justify 
theoretically,  but  may  provide  an  adequate  empirical  model  of  the  noise.  We  relate  the  effect  of  the 
ratio  of  the  signal  to  pre-event  noise,  to  T  of  (1),  as  follows.  As  the  signal  and  noise  are  assumed  to  be 
independent,  the  mean  squared  sum  of  their  amplitudes  should  be  additive,  and  so  we  can  write 


$$A_m^2 = A_s^2 + A_n^2 = A_s^2 \left(1 + \frac{A_n^2}{A_s^2}\right) \qquad (A1)$$

with $A_m$, $A_s$, and $A_n$ the measured, signal, and noise amplitudes. Hence,

$$r_{i,j} = \frac{A_m}{A_s} = \sqrt{1 + \frac{A_n^2}{A_s^2}} \qquad (A2)$$

We set the minimum permissible value of $\sigma_{i,j}$ at log(1.2), which is done in practice by taking
$\sigma_{i,j} = \log(1.2 \cdot r_{i,j})$. When the signal and noise levels are equal, we have a minimum weight of
$[\log(1.2 \cdot \sqrt{2})]^{-1}$. The effective maximum weight of $[\log(1.2)]^{-1}$ is approached when the signal
to noise ratio becomes very large (at a signal to noise amplitude ratio of about 8 to 10, the differences
between weights become fairly small).
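
The behavior of these a priori weights can be checked with a few lines of code. The sketch below assumes that the logarithm is base 10 (the base is not stated here) and that the ratio entering the standard deviation is the square root of one plus the squared noise-to-signal amplitude ratio, which is what the limiting values quoted above imply. The function name and the floor parameter name are hypothetical.

```python
import math


def a_priori_weight(signal_amp, noise_amp, floor=1.2):
    """Inverse of the a priori standard deviation log10(floor * r).

    r = sqrt(1 + (noise/signal)^2); base-10 logarithm is an assumption.
    """
    ratio = math.sqrt(1.0 + (noise_amp / signal_amp) ** 2)
    sigma = math.log10(floor * ratio)
    return 1.0 / sigma


w_equal = a_priori_weight(1.0, 1.0)    # signal and noise levels equal
w_snr8 = a_priori_weight(8.0, 1.0)
w_snr10 = a_priori_weight(10.0, 1.0)
w_limit = 1.0 / math.log10(1.2)        # limit for very large signal to noise
```

Under these assumptions the weights at signal to noise ratios of 8 and 10 differ from each other, and from the limiting value, by only a few percent, while the weight at a ratio of 1 is roughly a third of the limit.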

The  robust  reweighting  provides  us  with  the  opportunity  to  examine  the  validity  of  this  signal  to 
pre-event  noise  ratio  based  a  priori  weighting.  We  cannot  generally  do  this  for  real  data,  as  we  don’t 
know  the  true  solution.  It  turns  out  however,  that  the  robust  solution  is  nearly  independent  of  whether 
we  initiate  the  robust  iterations  with  the  weighted  or  unweighted  least  squares  solution.  The  difference 
between  magnitude  estimates  starting  with  the  weighted  vs.  the  unweighted  least  squares  estimates  was 
less  than  0.3%  for  all  events.  One  site  amplification  varied  by  nearly  4%,  one  by  2%,  15  varied 
between  1%  and  2%,  and  the  remaining  194  varied  by  less  than  1%.  If  we  accept  that  the  robust 
solution  is  closer  to  the  true  solution,  it  is  worth  asking  whether  the  a  priori  weights  we  chose  brought 
us  closer  to  that  solution.  We  can  see  from  Table  A1  that  they  did.  For  every  site  amplification  for 
which  the  a  priori  weighted  and  unweighted  least  squares  solutions  varied  by  more  than  5%,  the 
weighted solution was closer to the robust solution (Table A1).

Table 5A.1: Differences between unweighted and a priori weighted least squares site
amplification estimates. The robust estimates are effectively the same whether we begin with
or without a priori weighting, and we see that the a priori weighted values are always closer
to the presumably more accurate, robustly weighted values.

Difference   Number of         Unweighted      Weighted least   Robustly
between      events recorded   least squares   squares          weighted least
estimates    at each station   estimate        estimate         squares estimate
5.1%         10                1.49            1.57             2.13
5.1%         24                0.72            0.76             0.81
5.5%         18                0.45            0.48             0.68
5.7%         26                0.62            0.65             0.72
8.0%         16                0.80            0.87             1.05
8.6%         34                0.59            0.65             0.68
14.5%        19                0.57            0.67             1.12
24.8%        37                0.46            0.61             0.60
28.2%        28                0.51            0.71             0.68
28.5%        18                0.78            1.09             1.05


Chapter  6 


Diffuse  coda  site  amplifications  in  southern  California 
and  the  nature  of  Lg  waves 


Abstract 

We use near-receiver-scattered teleseismic coda (diffuse coda) to calibrate the amplification of 189
southern  California  seismic  network  (SCSN)  stations.  This  calibration  is  done  to  enable  the  estimation 
of  variations  in  the  absolute  amplitude  of  the  crustal  seismic  phase  Lg.  We  investigate  the  basis  for  the 
assumption  that  diffuse  coda  provides  an  isotropic  source  of  Lg-like  energy  through  a  review  of 
previous  research  on  both  teleseismic  coda  and  Lg.  We  also  investigate  which  parameters  influence  site 
amplifications,  and  to  what  extent.  The  results  of  this  analysis  provide  the  basis  for  the  design  of  an  Lg 
propagation  study.  Specifically,  we  discuss  how  best  to  control  factors  important  to  site  amplification  so 
that  observed  Lg  amplitude  variations  may  be  attributed  fully  to  path  effects. 

To  separate  the  diffuse  from  the  coherent  (near-source  scattered)  component  of  teleseismic  coda, 
we  remove  the  network  beam  from  each  individual  trace.  The  site  amplifications  and  signal  magnitudes 
are  simultaneously  estimated  using  a  maximum  likelihood  approach  for  doubly  censored  data,  with 
robust  re-weighting  (Baker  and  Minster,  1996).  The  insensitivity  of  site  amplifications  to  small 
differences  in  wavetype  is  verified  by  the  high  correlation  of  those  calculated  from  deep  event  coda  and 
those  calculated  from  shallow  event  coda.  The  diffuse  coda  site  amplifications  are  also  well  correlated 
with site amplifications determined from S-wave coda of local events (Su and Aki, 1995).

Introduction

We  have  isolated  the  near-receiver  scattered  component  of  teleseismic  coda  and  used  it  to  calibrate 
site amplifications at SCSN stations (figure 1). Instrument gains were calculated for all stations at the
time of each event (Wald et al., 1994), and data were excluded for any station and time at which an
instrument constant was unknown.

There  are  three  distinct  stages  to  this  work,  which  we  discuss  separately.  We  first  consider  the 
nature  of  teleseismic  coda  and  Lg  and  discuss  the  implications  for  the  design  of  an  Lg  propagation 
study.  We  then  describe  the  processing  required  to  separate  diffuse  from  coherent  coda.  Finally,  we 
examine  the  site  amplifications  themselves.  We  compare  the  diffuse  coda  site  amplifications  calculated 
from  just  deep  event  coda  with  those  calculated  from  shallow  event  coda.  We  also  compare  the  diffuse 
coda  site  amplifications  calculated  from  all  events  with  site  amplifications  from  local  S-wave  coda,  and 
consider  implications  of  the  differences. 

Motivation  -  Isolating  site  effects  from  propagation  effects  on  absolute  amplitudes. 

High  frequency  regional  discriminants  for  nuclear  verification  fail  quite  often.  That  is,  they  classify 
an explosion as an earthquake or vice versa (e.g. Taylor et al., 1989). Some such errors may be due to
truly odd sources or to near-source scattering, but much of the misclassification results from changes in
the signal due to structure along the path of propagation (e.g. Baker and Minster, 1995). Thus, these
changes may be predictable. Zhang et al. (1994) showed that some variation of Lg amplitudes, relative
to other regional phases, is reasonably well correlated with measurable properties of the path, such as
statistics  of  topography.  That  study  was  based  on  just  7  stations  spread  out  across  Eurasia,  and  the  path 
lengths  were  one  or  two  orders  of  magnitude  greater  than  the  scale  at  which  Lg  has  been  observed  to 
disappear  entirely.  For  example,  the  Lg/Pg  amplitude  ratio,  the  best  single  high  frequency  regional 
discriminant (Taylor et al., 1989), has been observed to change dramatically over just 20 km distance
for regional events recorded on the SCSN (Baker and Minster, 1995). To better understand the physics
of the problem, we must quantify the variations in Lg amplitude with path properties, as was done by
Zhang and Lay (1994), but with a densely spaced network such as the SCSN. We must also resolve the
ambiguity as to which phase varies, Lg or Pg, which is why Lg site amplifications are necessary. To
calibrate each site it would be ideal to have an isotropic source of Lg of known amplitude. We use the
diffuse teleseismic coda to approximate that ideal.

Figure 6.1: Stations of the southern California seismic network for which site amplifications were calculated.

What  controls  site  amplifications? 

To  determine  whether  diffuse  coda  is  an  appropriate  surrogate  for  Lg,  we  first  must  consider  what 
characteristics  of  a  wave  influence  its  amplification.  Amplification  has  long  been  recognized  to  depend 
on  the  impedance  of  the  uppermost  layer  (e.g.  Gutenberg,  1957).  In  southern  California,  the  mean 
amplification for large numbers of sites binned by sediment age correlates well with the age, which
presumably correlates with impedance (Su and Aki, 1995). There is significant variation within each
bin, however, which suggests that there are important secondary factors influencing amplification.
Those factors are varied, and many of them are difficult to measure. For example, the
effects  on  surface  amplification  of  focusing  and  de-focusing  of  various  surface  topographies  have  been 
modeled  and  shown  to  be  significant.  Specifically,  Kawase  (1988)  modeled  the  effects  of  canyon 
topography,  Bouchon  (1973)  modeled  the  effects  of  mountain  topography,  and  Trifunac  (1971) 
modeled  the  effects  of  an  alluvium  filled  valley.  Benites  and  Aki  (1989)  simulated  the  effect  of  small- 
scale,  near-surface  heterogeneity  on  surface  amplification,  and  found  that  both  higher  and  lower 
impedance  inclusions  de-amplify  the  surface  motion.  The  harder  inclusions  do  so  by  scattering  and  de- 
focusing  the  incoming  energy,  whereas  the  softer  inclusions  do  so  by  trapping  energy  within  the 
inclusions,  where  it  eventually  attenuates  as  it  resonates. 

The  layer  thickness  to  which  a  signal  is  sensitive  will  depend  on  its  frequency.  In  media  whose 
impedance  increases  with  depth,  higher  frequencies  should  have  greater  amplification.  This  frequency 
dependence  has  also  been  observed  (e.g.  Gutenberg,  1957). 
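
The dependence on impedance and frequency described above can be illustrated with two standard approximations. This is a minimal sketch, not the method of this chapter: it assumes conservation of energy flux for a wave travelling through a medium whose properties change slowly (so amplitude scales as the inverse square root of impedance, ignoring attenuation, resonance, and reflections), and it uses the quarter-wavelength rule of thumb for the depth to which a given frequency is sensitive. The function names and the example values are hypothetical.

```python
import math


def impedance_amplification(rho_deep, v_deep, rho_surface, v_surface):
    """Amplitude ratio (surface / depth) from energy-flux conservation.

    Impedance is density times velocity; a lower near-surface impedance
    gives amplification greater than one. Attenuation and resonance are
    neglected (assumption).
    """
    return math.sqrt((rho_deep * v_deep) / (rho_surface * v_surface))


def quarter_wavelength_depth(velocity, frequency):
    """Depth (same length unit as velocity) sampled by a given frequency,
    using the quarter-wavelength rule of thumb (assumption)."""
    return velocity / (4.0 * frequency)


if __name__ == "__main__":
    # Illustrative values only: basement at 3.5 km/s over sediments at 1.0 km/s.
    print(impedance_amplification(2.7, 3.5, 2.0, 1.0))
    # Higher frequencies sample a thinner, typically softer, surface layer.
    for f in (0.6, 3.0):
        print(f, "Hz ->", quarter_wavelength_depth(1.0, f), "km")
```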

Figure 2 illustrates that the incidence angle at the surface should also be a factor in amplification
(e.g. Aki and Richards, 1980). This curve is valid for any velocity medium. In practice, however, the
incidence angle is likely to be highly variable and unpredictable. For example, Vernon et al. (1991)
examined data from a surface array and a borehole instrument placed in a nearly ideal locale, with a
planar surface a few meters above granite bedrock. Vertically incident P-waves were much more
coherent between 300 and 150 meters depth (from the borehole recordings) than they were between 150
meters depth and the surface. Further, the energy in coda is especially highly scattered. Vernon et al.
(1991) found surface recordings of local event S-waves to be incoherent at little more than one
wavelength spacing. Another array study in the same location found that local S-wave coda is
incoherent at less than one wavelength distances, and that most of the coda energy appears to have been
scattered very near the array (Wagner and Owens, 1993). Although the incidence angle of any individual
arrival is unlikely to be known precisely, there is probably a statistical difference in incidence at the
surface between steeply incident teleseismic coda and locally scattered, more horizontally traveling
coda, so the two may have different amplifications.

Figure 6.2: Vertical amplification at a free surface for incoming P-waves (top) and S-waves (bottom)
as a function of incidence angle.
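
The dependence of vertical amplification on incidence angle for an incoming P-wave can be sketched from the standard free-surface reflection coefficients for a homogeneous, isotropic half-space. This is an illustrative sketch only, not the computation used for the figure: the function name and the default velocities are assumptions, and only the incident-P case is shown. With this expression the result depends on the incidence angle and on the ratio of P to S velocity, not on the absolute velocities.

```python
import math


def vertical_amplification_p(incidence_deg, vp=6.0, vs=3.46):
    """Vertical surface displacement per unit incident P-wave amplitude.

    Homogeneous half-space with a traction-free surface. vp and vs are
    illustrative P and S velocities in km/s (assumed values).
    """
    angle = math.radians(incidence_deg)
    p = math.sin(angle) / vp                  # ray parameter (s/km)
    eta_p = math.cos(angle) / vp              # vertical P slowness
    eta_s = math.sqrt(1.0 / vs**2 - p**2)     # vertical S slowness (real: vs < vp)
    a = 1.0 - 2.0 * vs**2 * p**2
    denom = a**2 + 4.0 * vs**4 * p**2 * eta_p * eta_s
    return 2.0 * vp * eta_p * a / denom


if __name__ == "__main__":
    # Vertical incidence gives the familiar free-surface doubling;
    # the vertical response falls toward zero at grazing incidence.
    for angle in (0, 20, 40, 60, 80):
        print(angle, round(vertical_amplification_p(angle), 3))
```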

Resonance is commonly observed in sedimentary basins (e.g. Hough et al., 1992; Milana et al.,
1996) and could be a factor in site amplifications, especially for coda. Site amplifications for coda
waves, however, have been shown to correlate well with those for azimuthally averaged local S-waves
(Tsujiura, 1978), which should not be subject to resonance, so resonance may not be a factor in
amplifications for most sites. Also, local S-wave coda site amplifications for different frequencies correlate well, with a
few  notable  exceptions  in  large  sedimentary  basins.  We  discuss  this  further  in  the  section  comparing 
diffuse  and  local  coda  site  amplifications. 

In  addition  to  these  considerations,  the  problem  of  site  amplification  estimation  is  inherently 
underdetermined,  so  only  relative  site  amplifications  can  be  determined.  If  site  amplifications  were 
estimated  for  two  different  types  of  signals  at  the  same  frequency  (e.g.  steeply  incident  P-waves,  and 
surface  waves),  with  both  sets  of  signals  being  internally  consistent  in  ray  parameter  and  wave  type,  the 
relative  site  amplifications  should  be  the  same.  In  order  to  compute  useful  site  amplifications,  the  most 
important  factor  to  control  is  the  frequency  of  the  calibration  signal  and  the  signal  of  interest.  The  most 
common passband for the study of Lg is around 0.3 to 6 Hz (e.g. Zhang et al., 1994). This effectively
enhances Lg relative to lower frequency fundamental mode Rayleigh wave energy and higher frequency
noise.  We  have  calculated  site  amplifications  from  diffuse  coda  in  the  same  passband. 
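
The statement that only relative site amplifications can be determined can be made concrete with a small example. If the logarithm of each observed amplitude is modeled as a site term plus an event term, adding a constant to every site term and subtracting it from every event term leaves the predictions unchanged, so a constraint is needed. The sketch below assumes ordinary least squares with a zero-mean constraint on the site terms; it is not the maximum likelihood estimator for doubly censored data with robust re-weighting used in this chapter, and all names and values are hypothetical.

```python
def fit_site_terms(obs, n_iter=200):
    """Fit log-amplitude = site[station] + mag[event] by alternating means.

    obs maps (station, event) -> log10 amplitude. Site terms are constrained
    to have zero mean, which removes the constant trade-off between site
    and event terms. Plain least squares only (assumption).
    """
    stations = sorted({s for s, _ in obs})
    events = sorted({e for _, e in obs})
    site = {s: 0.0 for s in stations}
    mag = {e: 0.0 for e in events}
    for _ in range(n_iter):
        for e in events:
            res = [v - site[s] for (s, ev), v in obs.items() if ev == e]
            mag[e] = sum(res) / len(res)
        for s in stations:
            res = [v - mag[ev] for (st, ev), v in obs.items() if st == s]
            site[s] = sum(res) / len(res)
        # Apply the zero-mean constraint without changing the predictions.
        shift = sum(site.values()) / len(site)
        for s in stations:
            site[s] -= shift
        for e in events:
            mag[e] += shift
    return site, mag


if __name__ == "__main__":
    true_site = {"A": 0.3, "B": -0.1, "C": -0.2}
    true_mag = {1: 2.0, 2: 3.0}
    obs = {(s, e): true_site[s] + true_mag[e]
           for s in true_site for e in true_mag}
    del obs[("C", 2)]          # station C did not record event 2
    print(fit_site_terms(obs))
```

The zero-mean constraint is one arbitrary choice among many; fixing the amplification of a reference station would serve equally well.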

We  next  consider  the  nature  of  Lg  and  diffuse  coda.  We  first  review  previous  work  on  both. 
Modeling  of  Lg  blockage  was  thoroughly  discussed  in  chapter  4  of  this  thesis.  Here  we  review  more 
general  studies  of  Lg.  We  also  discuss  possible  differences  and  similarities  in  the  sources  of  earthquake 
and  nuclear  explosion  Lg,  and  the  source  of  diffuse  coda. 

What  is  Lg? 

The Lg phase was recognized early in the history of modern regional seismology (Press and Ewing,
1952) as it is often the largest phase on regional high frequency seismograms. The names Lg and Rg are
historical, coming from the initial guess that these were high frequency Love and Rayleigh waves
traveling in a thin near surface granite layer. Båth (1954) later argued that the phases propagate in a low
velocity channel within the mid-crust. Knopoff et al. (1973) demonstrated theoretically that the entire
crust provided the waveguide and no low-velocity layer was required. Oliver and Ewing (1957, 1958)
first recognized that Lg and Rg are composed of higher mode surface waves. Cara et al. (1981) resolved
higher mode Rayleigh waves within Lg, at two to five second period, and Wagner and Owens (1995)
did so at approximately 2 Hz. In current usage, Lg refers to the energy on all three components of
ground motion, typically arriving between 3.5 and 3.0 km/sec group velocity. Rg has come to refer to
high frequency (< 3 second period) fundamental mode Rayleigh waves (e.g. Lay and Wallace, 1995).

Many  of  the  distinguishing  properties  of  Lg  have  to  do  with  its  indeterminate,  scattered  nature.  Lg 
is  usually  composed  of  sufficiently  scattered  energy  that  it  is  impossible  to  identify  any  arrival  that  has 
followed  a  specific  raypath.  That  same  scattered  nature  of  Lg,  which  makes  it  so  useful  as  a  stable 
measure  of  magnitude  and  yield  within  a  calibrated  region  (e.g.  Nuttli,  1986),  also  makes  it  difficult  to 
model in a deterministic manner. The common practical definition of Lg as whatever energy arrives
within some particular group velocity window further reflects its disordered character. That is,
although there is a ray description of Lg as multiple S-wave reverberations, it is not a practical
definition, as the discrete phases SmS, 2SmS, etc., are rarely observed. Vogfjord and Langston (1996)
were  able  to  distinguish  discrete  S-wave  multiple  arrivals,  but  their  resolution  required  beamforming  at 
a  large  array.  One  of  the  first  features  of  Lg  that  modelers  attempted  to  duplicate  with  synthetics  was  its 
disordered,  incoherent  character.  Bouchon  (1982)  demonstrated  that  one  way  the  indeterminate  nature 
of  Lg  could  be  reproduced  was  by  summing  sufficient  rays  reverberating  off  all  layers  in  a  simple  four 
layer  crustal  model. 

A  related  characteristic  of  Lg  that  has  received  attention  is  the  long  duration  of  its  coda.  Aki  (1969) 
described  local  earthquake  coda  by  single  scattering  off  small  scale  crustal  heterogeneities.  This  idea, 
and  its  successor,  multiple  scattering,  have  been  useful  in  explaining  the  indeterminate  nature  of  coda  of 
many  phases  besides  Lg  (e.g.  Wu  and  Aki,  1988).  The  importance  of  not  just  scattering,  but  of  mode 
conversions, in explaining Lg coda was inferred from array observations (Der et al., 1984). Bouchon
and  Coutant  (1994)  used  synthetic  seismograms  to  illustrate  the  potential  importance  to  Lg  coda 
duration  of  scattering  due  to  Moho  roughness.  They  also  suggest  that  crustal  heterogeneities  play  such 
an  important  role  in  scattering  that  the  backscattered  wavefield  in  Lg  could  be  used  to  map 
heterogeneity. 

Another  long  perplexing  characteristic  of  Lg,  the  large  amplitude  of  transverse  Lg  from  explosion 
sources,  provides  further  evidence  of  the  importance  of  scattering  to  this  phase.  Explosions  should 
generate  largely  compressional  waves  (e.g.  Masse,  1981).  When  those  P-waves  scatter  from 
discontinuities  in  a  homogeneous,  isotropic,  plane  layered  structure,  there  should  be  some  conversion  to 
SV,  but  not  SH.  The  amplitude  increase  of  transverse  Lg  relative  to  the  vertical  and  radial  components, 
with  distance  from  the  source,  continues  until  all  three  components  are  roughly  equal  in  power.  This 
suggests  constant  scattering  and  interconversion  of  Lg  energy  as  it  propagates  (e.g.  Gupta  and 
Blandford,  1983).  Velocity  heterogeneities  in  the  crust  provide  one  possible  explanation  for  transverse 
Lg  (Gupta  and  Blandford,  1983).  Maupin  (1990)  demonstrated  that  anisotropy  could  also  be  an 
effective means of scattering of SV to SH.

While the existence of transverse Lg from explosion sources indicates something about Lg’s
propagation,  the  very  existence  of  Lg  from  explosions  indicates  something  about  Lg  generation.  One 
explanation  of  explosion  Lg  bears  directly  on  the  nature  of  diffuse  coda  and  its  similarity  to  Lg.  Gupta 
et  al.  (1992)  note  that  as  nuclear  explosions  are  buried  in  a  low  velocity  surface  layer,  they  produce  a 
significant  amount  of  Rg  energy.  They  argue  that  Rg  is  short-lived,  scattering  to  Lg  off  surface 
irregularities  and  shallow  layer  heterogeneities,  as  well  as  dissipating  due  to  strong  anelasticity  in  the 
surface  layer.  Another  possible  mechanism  for  explosion-generated  Lg  is  the  direct  conversion  of  P-  to 
S-wave  energy,  which  is  also  dependent  on  the  explosion  occurring  within  a  low  velocity  surface  layer 
(Gutowski et al., 1984).

The  spectra  of  explosion  and  earthquake  Lg  phases  are  different,  with  more  high  frequency  energy 
in  the  earthquake  spectra,  starting  at  approximately  2  Hz.  The  deficiency  of  high  frequencies  in 
explosion  Lg  may  be  due  to  both  the  scattering  of  Lg  from  Rg,  which  is  lower  frequency  than  the 
original  source  spectra,  and  to  greater  attenuation  of  high  frequencies  in  the  surficial  layer  where  the 
generation of Lg occurs (e.g. Goldstein, 1995). The spectral content and amplitude of Lg also vary
with explosion depth (e.g. Campillo et al., 1984; Goldstein, 1995), while earthquake source depth
controls the relative excitation of different surface wave modes (e.g. Harkrider, 1970; Campillo et al.,
1984). Such depth dependence could affect the dominant phase and group velocities of Lg, although to
our  knowledge,  no  such  dependence  has  been  observed.  Source  spectra  also  vary,  probably  due  to  a 
combination  of  other  source  properties. 

Wagner  and  Owens  (1995)  performed  3-component  broadband  array  analysis  of  particle  motions  in 
conjunction  with  more  standard  array  processing  techniques  to  provide  further  information  on  the 
nature  of  Lg.  They  examined  Lg  from  two  nuclear  explosions  and  one  earthquake  from  within  the 
Nevada Test Site (NTS), as recorded at a 6 km aperture 3-component broadband array at Piñon Flat
Observatory in southern California, and made a number of observations important to the design of an
Lg propagation study. The highest frequency energy arrives in the earliest part of the Lg wavetrain, at
3.6 to 3.4 km/sec group velocity. Distinct Love and Rayleigh waves, at 2.5 and 1.8 Hz respectively,
were distinguished arriving simultaneously in that early, most prominent part of a nuclear explosion Lg
phase,  confirming  the  description  of  Lg  as  higher  mode  surface  waves.  The  entire  Lg  wavetrain 
contained  considerable  forward  scattered  and  multi-pathed  energy,  with  significant  backscattering 
beginning  at  approximately  3.0  km/sec  group  velocity.  Much  of  the  energy  in  the  latter  part  of  the  Lg 
wavetrain  could  not  be  modeled  as  plane  waves,  indicating  that  it  was  scattered  from  very  near  the 
array. 

What  is  diffuse  coda  and  is  it  an  isotropic  source  of  Lg-like  energy? 

Teleseismic  coda  is  generally  understood  to  consist  of  the  sum  of  near-source  and  near-receiver 
scattered energy (e.g. Langston, 1989). The coherent component of teleseismic coda consists of energy
scattered into P-waves near the source that travel along a path similar to that of the initial P wave
(Bannister et al., 1990; Dainty, 1990). The diffuse component of teleseismic coda consists of energy
scattered near the receiver from the incoming P wave, which travels with much lower velocity, for
example, approximately 3.5 km/sec near NORESS (e.g. Bannister et al., 1990; Dainty, 1990). The
diffuse coda is composed primarily of nearly horizontally propagating shear waves trapped in the
crustal waveguide, that is, fundamental and higher mode surface waves (e.g. Dainty, 1990). The
teleseismic coda of deep focus earthquakes consists largely of diffuse coda (e.g. Dainty, 1990;
Revenaugh, 1995a), and so may provide the multi-azimuthal source of energy appropriate for Lg site
calibration.  By  separating  the  diffuse  and  coherent  components,  we  are  also  able  to  utilize  shallow  event 
coda. 

Literature  on  the  nature  of  diffuse  teleseismic  coda  is  less  extensive  than  that  of  Lg,  largely  because 
its  diffuse  character  requires  that  array  data,  which  are  not  nearly  as  common  as  single  station  data,  be 
extensively  processed  for  anything  but  trivial  observations  to  be  made.  Most  array  studies  have 
concentrated on the identification of discrete single scatterers of P to Rg near arrays (Key, 1967;
Bannister et al., 1990; Gupta et al., 1990; Hedlin et al., 1991). Dainty and Harris (1989) looked directly
at the nature of teleseismic coda, and determined that the near-receiver scattered component was made
up of incoherent S-wave energy. Rg, even from single identifiable scatterers, does not dominate the
coda.  Individual  scattered  Rg  phases  are  only  infrequently  distinguishable  in  the  raw  seismograms,  and 
many  records  must  be  processed  and  stacked  for  identification  of  the  most  prominent  scatterers. 
Revenaugh  (1995a, b)  notes  that  the  identification  of  individual  scatterers  associated  with  topography 
has  been  limited  to  those  with  vertical  scale  lengths  on  the  order  of  half  the  seismic  wavelength  used. 
The scatterers must also be extremely close to the array, as Rg is a very short-lived phase due to its
dependence on the near surface velocity structure, which, especially in southern California, is
heterogeneous and highly attenuative. In the first such study, Key (1967) identified a scatterer 13 km
from the Eskdalemuir array in Scotland. Bannister et al. (1990), using a semblance technique, identified
2 scatterers 10 and 30 km from the NORESS array in Scandinavia. They looked for Rg scattered from
the largest topographic feature in the area, a mountain range between 100 and 300 km distance from the
array, but saw no such scattered energy. They were also unable to discern scattered energy from
mountainous areas 60 km to the north and west of the array. Gupta et al. (1990) also imaged the
scatterer 30 km distant from NORESS using f-k analysis and Hedlin et al. (1991) imaged both scatterers
using a beam deconvolution technique and migration, but neither was able to resolve more distant
scatterers.

The  lack  of  Rg  from  anything  but  very  near-receiver  scatterers  was  also  observed  by  Revenaugh 
(1995a),  who  used  the  SCSN  to  investigate  the  contribution  from  large  areas  of  moderately  efficient 
scattering  to  near-receiver  scattered  teleseismic  coda.  The  contribution  of  any  coherent  sources  of  Rg  to 
the  energy  in  a  particular  seismogram  was  very  small.  Using  migration,  he  obtained  his  best  image  of 
topographic  scattering  efficiency  at  a  group  velocity  of  2.9  km/sec  (which  does  not  imply  that  most 
diffuse  coda  energy  travels  at  that  group  velocity,  only  that  the  most  coherent  energy  does). 

Revenaugh  (personal  communication,  1996)  observed  that  coherent  Rg  typically  travels  no  more 
than  50  km  in  southern  California.  Rg  presumably  does  not  completely  attenuate  intrinsically,  but 
scatters  into  other  phases.  This  recalls  the  argument  discussed  earlier  regarding  the  source  of  nuclear 
explosion Lg as being due to near-source, surface scattering of P to Rg, and subsequent scattering of Rg
to Lg. This suggests that diffuse coda, especially that arriving more than 10 to 15 seconds after the
initial  P  arrival,  may  be  largely  composed  of  Lg-like  energy.  Bannister  et  al.  (1990)  examined  two  coda 
windows  in  their  study  at  NORESS,  and  their  findings  confirm  this  suggestion.  They  only  observed 
scattered  Rg  energy  in  the  earlier  time  window,  covering  the  first  30  seconds  after  the  initial  P  arrival. 
This energy accounted for 10% to 30% of the signal, and the coherent coda accounted for 30% to 50%.
The other approximately 40% of the signal was attributed to diffuse scattering at the receiver end. The
authors also found that illumination of scatterers was somewhat dependent on the azimuth of the
original signal. The second coda window, from 35 to 70 seconds after P, was dominated by energy with
S-wave  phase  velocities,  more  diffusely  scattered  than  the  earlier  Rg. 

Bannister  et  al.  (1990)  conclude  that  the  dominance  of  S-waves  in  later  coda  is  due  to  direct  P-to-S 
scattering  at  greater  distance  from  the  receiver,  which  persists  because  S  attenuates  so  much  more 
slowly  than  Rg.  This  is  reminiscent  of  the  mechanism  of  P  to  S  surface  layer  scattering  proposed  by 
Gutowski  et  al.  (1984).  For  our  purposes,  it  is  not  important  whether  the  coda  comes  from  P  to  Rg 
scattering,  and  subsequent  scattering  of  Rg  to  higher  mode  surface  waves,  or  from  direct  P  to  S 
scattering.  The  important  fact  is  the  unanimity  of  array  studies  in  identifying  near-receiver  scattered 
coda,  much  more  than  10  seconds  after  the  initial  P-wave,  as  diffuse  S-wave  energy. 

Revenaugh  (1995b)  also  uses  migration  to  image  the  uppermost  mantle  using  P  to  P  and  P  to  S 
scattering  beneath  the  SCSN.  This  required  twice  as  many  data  as  the  P  to  Rg  scattering  (Revenaugh 
1995a),  as  near-receiver  upper  mantle  body  wave  to  body  wave  scattering  provides  a  very  small 
percentage  of  the  energy  in  teleseismic  coda. 

Implications  for  a  study  of  Lg  propagation 

The  foregoing  discussion  has  brought  out  several  points  relevant  to  the  design  of  an  Lg  propagation 
study  in  which  absolute  Lg  amplitudes  will  be  obtained  by  correction  based  on  diffuse  coda  site 
amplifications.  Such  a  study  would  be  performed  in  order  to  better  quantify  Lg  blockage,  and  would  be 
accomplished by comparing changes in absolute Lg amplitude between stations with the intervening
path structure (see chapter 4). Thus we want to minimize anything else that could affect the estimation
of  absolute  Lg  amplitude. 

Specifically,  the  preceding  discussion  has  indicated  several  ways  to  minimize  variations  in 
incidence  and  frequency  of  the  Lg  observed,  which  will  be  important  in  minimizing  differences  in  the 
amplification  of  different  parts  of  the  signal.  The  existence  of  significant  multipathing  and 
backscattering  in  later  Lg  alerts  us  to  the  importance  of  using  a  fairly  narrow  and  early  group  velocity 
window for Lg measurements. Otherwise, differences in amplitude between stations could be masked
by later arrivals.

Additionally,  a  narrow  group  velocity  window  would  minimize  amplification  differences  by 
limiting  the  range  of  modes  examined  within  the  dispersive  Lg  train.  Early  and  late  Lg,  even  within  the 
0.6 to 3 Hz passband, have distinctly different frequency contents, and therefore different amplifications. Also, early
and  late  Lg  could  have  different  phase  velocities,  even  at  the  same  frequency,  and  so  would  have 
different  amplifications.  The  practical  lower  limit  to  the  size  of  the  group  velocity  window  will  depend 
on  the  extent  of  variations  in  the  intrinsic  velocity  of  the  media  over  southern  California.  For  our 
purposes  it  appears  to  be  most  important  to  cut  off  the  window  before  the  fundamental  mode  Rayleigh 
wave,  at  approximately  3.0  km/sec  group  velocity,  as  Rg  will  certainly  have  a  different  phase  velocity. 
That  group  velocity  also  coincides  with  the  arrival  of  more  backscattered  energy,  which  we  want  to 
avoid. 
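
The group velocity window translates into a time window on each record once the epicentral distance is known. The following is a minimal sketch under stated assumptions: the function name is hypothetical, times are measured from the event origin time, and the default bounds of 3.5 and 3.0 km/sec are simply the Lg range quoted earlier in the text rather than a window selected in this study.

```python
def group_velocity_window(distance_km, u_fast=3.5, u_slow=3.0):
    """Return (start, end) times in seconds after origin for arrivals with
    group velocity between u_fast and u_slow (km/s) at the given distance.

    The slower bound of 3.0 km/s excludes the later, more backscattered
    energy and the fundamental mode Rayleigh wave discussed in the text.
    """
    if u_fast <= u_slow:
        raise ValueError("u_fast must exceed u_slow")
    return distance_km / u_fast, distance_km / u_slow


if __name__ == "__main__":
    # Window length grows with distance, so a fixed group velocity range
    # does not correspond to a fixed duration.
    for d in (105.0, 210.0, 420.0):
        start, end = group_velocity_window(d)
        print(d, "km:", start, "to", end, "s")
```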

There  are  three  important  reasons  for  limiting  the  passband  of  Lg  in  a  study  of  propagation.  As 
mentioned  earlier,  Lg  is  dispersive.  As  with  the  use  of  a  short  group  velocity  window,  the  purpose  of 
limiting the passband would be to minimize variation of amplification within the signal used. Also, the
dependence of spectra on many source parameters argues for the use of a narrow passband, to
prevent differences between spectra of different events from causing different relative amplification
and so confounding interpretation of path effects. Finally, both the passband and group velocity window
should be limited to minimize the variation in modes due to their different relative excitation by
different events. That is desirable because the mechanisms of blockage are poorly known (chapter 4),
and  it  is  possible  that  the  blockage  of  different  modes  could  depend  on  different  crustal  features. 

The  most  important  consideration  in  the  use  of  teleseismic  coda  as  a  calibration  signal  will  be  the 
separation of steeply incident P-wave energy from the locally scattered surface waves. Also, we wish to
avoid Rg contamination of the calibration signal by the use of a late coda window. Additionally, we
need  to  minimize  differences  between  the  spectra  of  the  calibration  signals  obtained  from  different 
events  by  using  a  narrow  passband  for  all  records.  The  moderate  dependence  of  scattered  energy 
azimuth  on  source  azimuth  (Bannister  et  al.  1990)  suggests  that  improving  azimuthal  coverage  of 
sources,  by  using  shallow  events,  may  improve  the  accuracy  of  SCSN  site  amplifications.  This  is 
balanced  by  the  recognition  that  shallow  events  may  contain  some  Rg,  even  when  a  late  coda  window  is 
used,  generated  by  scattering  from  the  coherent  coda.  These  theoretical  concerns  will  be  tested  by 
comparison  of  site  amplifications  from  deep  vs.  shallow  event  coda,  and  by  the  diffuse  coda  vs.  local  S- 
wave  coda  site  amplifications.  We  may  also  increase  the  isotropy  of  the  coda  by  the  use  of  late  and  long 
coda  windows. 

Isolation  of  the  diffuse  component  of  teleseismic  coda 

We  are  also  constrained  in  our  choice  of  coda  window  by  the  desire  to  avoid  all  secondary  phases, 
such  as  PP  and  PcP,  which  could  supply  steeply  incident  P-wave  energy.  Our  choice  is  further 
constrained  by  the  length  of  the  trigger  for  a  given  event,  and  by  event  size.  For  large  events  there  is 
frequently  a  trade-off  between  starting  late  enough  that  most  records  are  on-scale,  but  not  so  late  that 
the  end  of  the  event  trigger  time  is  reached  at  many  stations.  For  one  large  deep  event  (m 5=7.0, 
depth=630km,  distance=68°)  with  an  especially  long  trigger  time  on  the  SCSN,  we  are  able  to  use  two 
separate  coda  windows.  The  first  is  from  85  to  1 15  seconds  after  the  initial  P-wave,  and  the  second  is  40 
to  180  seconds  after  the  sP  arrival.  In  the  first  window,  many  records  are  clipped,  but  many  stations  for 
which  the  signal  is  commonly  below  the  background  noise  level  record  good  data.  In  the  second 
window,  the  stations  that  are  clipped  in  the  earlier  window  are  back  on-scale  (the  clipping  is  a  matter  of 
limited dynamic range of the data acquisition system, so the records are valid once the data are again
on-scale). 

Non-Lg-like  elements  in  the  calibration  signal  from  deep  events  can  be  minimized  by  separating 
the  diffuse  and  the  coherent  coda.  This  separation  also  permits  the  use  of  shallow  events,  providing 
more  complete  and  even  azimuthal  coverage  of  sources  (figure  3).  To  estimate  the  coherent  portion  of 
the  coda,  we  stack  the  seismograms  after  alignment  by  cross-correlation  of  the  initial  P-wave  (and  for 
shallow  events,  on  the  entire  train  of  P,  pP,  and  sP).  In  this  way,  static  station  corrections  are 
automatically  incorporated.  We  make  the  assumption  that  all  near-source  scattered  energy  will  stack 
coherently  and  near-receiver  scattered  energy  will  be  incoherent.  We  then  remove  the  coherent  coda 
from  each  individual  record  by  subtracting  the  scaled  estimate  of  the  coherent  coda  (the  "beam")  from 
the  coda  of  each  individual  record.  Before  subtraction  from  an  individual  trace,  the  beam  is  scaled  by  its 
cross-correlation  with  that  trace,  which  minimizes  the  amplitude  of  the  remaining  energy.  The  result  is 
an  estimate  of  the  diffuse  coda  at  each  station. 
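
The stacking and scaled subtraction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used in this study: the traces are assumed to be already aligned on the initial P arrival, the beam is taken as a plain mean, and the scale factor is taken to be the zero-lag cross-correlation of trace and beam divided by the beam power (the least-squares choice, which minimizes the energy left after subtraction). The function name `remove_beam` and the synthetic data are hypothetical.

```python
import numpy as np

def remove_beam(traces):
    """Subtract a scaled stack (the "beam") from each aligned trace.

    traces: 2-D array, one row per station, assumed already aligned on
    the initial P arrival (the alignment step is not shown).
    Returns (beam, residuals); residuals estimate the diffuse coda.
    """
    traces = np.asarray(traces, dtype=float)
    beam = traces.mean(axis=0)                 # estimate of the coherent coda
    beam_power = np.dot(beam, beam)            # assumed non-zero
    # Least-squares scale per trace: zero-lag cross-correlation with the
    # beam, normalized by the beam power.
    scales = traces @ beam / beam_power
    residuals = traces - np.outer(scales, beam)
    return beam, residuals

# Synthetic demonstration with made-up numbers: a common 1 Hz signal with
# station-dependent gain plus independent noise at four stations.
rng = np.random.default_rng(0)
t = np.arange(600) * 0.05
coherent = np.sin(2 * np.pi * 1.0 * t)
gains = np.array([0.5, 1.0, 2.0, 1.5])
traces = gains[:, None] * coherent + 0.3 * rng.standard_normal((4, t.size))
beam, residuals = remove_beam(traces)
```

With this choice of scale factor each residual is orthogonal to the beam, so no further multiple of the beam could reduce its energy.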

We  carefully  consider  and  test  the  assumption  that  all  near-source  scattered  energy  will  stack 
coherently.  If  scattering  at  several  degrees  distance  from  the  source  were  to  contribute  to  the  coherent 
component  of  coda,  the  time  lag  between  the  initial  P  arrival  and  that  scattered  energy  would  not  be 
constant  over  the  several  hundred  kilometers  spanned  by  the  network  and  so  that  contribution  to  the 
coda  would  not  stack  coherently.  For  example,  for  a  source  50  degrees  from  the  network,  at  600  km 
depth,  and  a  scatterer  3  degrees  from  the  source,  at  the  same  depth  and  in  the  plane  of  the  ray  (to 
maximize  the  variation  in  ray  parameter),  the  difference  in  time  lag  between  the  direct  and  scattered  P 
wave  for  stations  0.1  degrees  apart  in  the  plane  of  the  ray  would  be  0.02  seconds.  For  stations  1  degree 
apart,  the  difference  in  time  lags  would  be  0.19  seconds,  and  so  the  coherence  of  the  scattered  phase 
would  be  degraded,  as  energy  in  the  signal  peaks  at  approximately  one  Hz.  For  stations  3  degrees  apart 
in  the  propagation  direction,  the  difference  in  direct  and  scattered  P  times  would  be  0.6  seconds,  and  so 
when  stacked,  aligned  on  the  initial  P  arrivals,  the  scattered  arrivals  would  be  nearly  180°  out  of  phase 
and would largely cancel. As we want to use only energy traveling laterally in the crust to estimate Lg
amplification, we are concerned about how much steeply incident energy may exist that is not removed
by subtraction of the beam. In a situation as described above, where the change in wavenumber across
the network of some scattered energy in the coherent coda is significantly different from the change in
the initial arrival wavenumber across the network, the correlation coefficient between coda records
should vary with station spacing.

... inverted triangles. Symbols are scaled by magnitude.
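
The time-lag example in the preceding paragraph can be turned into phase shifts with a few lines of arithmetic. The sketch below assumes a single dominant frequency of 1 Hz (the text states only that energy peaks at approximately one Hz) and uses the lag differences quoted in the text. The stack amplitude formula is an added simplification: it treats the scattered arrival at two stations as two unit sinusoids offset in phase. The function names are hypothetical.

```python
import math

def phase_shift_deg(lag_difference_s, frequency_hz=1.0):
    """Phase offset, in degrees, produced by a difference in time lag."""
    return 360.0 * frequency_hz * lag_difference_s

def two_trace_stack_amplitude(phase_deg):
    """Amplitude of the mean of two unit sinusoids offset by phase_deg.

    1.0 means fully coherent stacking, 0.0 means complete cancellation.
    """
    return abs(math.cos(math.radians(phase_deg) / 2.0))

# (station separation in degrees, lag difference in seconds) from the text
cases = [(0.1, 0.02), (1.0, 0.19), (3.0, 0.6)]
for separation_deg, lag_s in cases:
    phase = phase_shift_deg(lag_s)
    amplitude = two_trace_stack_amplitude(phase)
    print(f"{separation_deg:4.1f} deg apart: phase {phase:6.1f} deg, "
          f"stack amplitude {amplitude:.2f}")
```

Under these assumptions the stacked amplitude falls steadily with station separation, which is the behavior the argument relies on.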

Figure  4  shows  the  correlation  coefficients  of  coda  windows  for  all  SCSN  station  pairs  plotted  vs. 
interstation  spacing,  for  a  deep  earthquake.  Before  beam  removal  (top),  the  least-squares  fit  line  to  the 
points  is  virtually  horizontal,  and  the  mean  value  is  0.0354;  while  some  non-zero  coherence  is  apparent, 
it  does  not  vary  significantly  with  station  separation.  The  lower  plot  shows  that  the  coherent  portion  of 
the  coda  is  effectively  removed  by  beam  subtraction.  After  beam  subtraction,  the  mean  value  is 
effectively  zero.  We  obtained  similar  results  for  shallow  earthquakes,  with  much  larger  initial  values  of 
mean  coherence,  but  effectively  zero  coherence  after  beam  removal,  indicating  effective  separation  of 
the  coherent  from  the  diffuse  coda  even  for  shallow  earthquakes  (figure  5).  Our  ability  to  perform  this 
separation  on  shallow  events  has  allowed  us  to  improve  our  azimuthal  coverage  of  teleseismic  sources. 
In  a  few  cases,  there  was  a  distinct  slope  to  the  line  fit  to  the  data,  especially  for  shallow  events, 
indicating  that  some  steeply  incident  P-energy  is  present  in  the  calibration  signal  (figure  6).  Based  on 
the  difference  between  coherence  at  adjacent  and  distant  stations,  the  remaining  coherent  coda  appears 
to  account  for  less  than  10%  of  the  total  power  of  the  calibration  signal  in  even  the  worst  case. 
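
The coherence test described above (correlation coefficient of coda windows for every station pair, plotted against interstation spacing and fit with a least-squares line) can be sketched as below. This is an illustration only: it assumes zero-lag correlation coefficients, station positions given as planar coordinates in km, and synthetic windows built from random noise. The function name `correlation_vs_spacing` is hypothetical.

```python
import numpy as np
from itertools import combinations

def correlation_vs_spacing(windows, x_km, y_km):
    """Pairwise zero-lag correlation of coda windows vs. station spacing.

    windows: 2-D array, one coda window per station (rows).
    x_km, y_km: station coordinates in km (planar approximation).
    Returns spacing, coefficients, slope, intercept, mean coefficient.
    """
    windows = np.asarray(windows, dtype=float)
    x_km = np.asarray(x_km, dtype=float)
    y_km = np.asarray(y_km, dtype=float)
    cc = np.corrcoef(windows)                      # station-by-station matrix
    pairs = list(combinations(range(windows.shape[0]), 2))
    spacing = np.array([np.hypot(x_km[i] - x_km[j], y_km[i] - y_km[j])
                        for i, j in pairs])
    coeff = np.array([cc[i, j] for i, j in pairs])
    slope, intercept = np.polyfit(spacing, coeff, 1)   # least-squares line
    return spacing, coeff, slope, intercept, coeff.mean()

# Synthetic demonstration: six stations sharing a weak common signal.
rng = np.random.default_rng(1)
n_stations, n_samples = 6, 2000
common = rng.standard_normal(n_samples)
windows = 0.3 * common + rng.standard_normal((n_stations, n_samples))
x_km = rng.uniform(0.0, 300.0, n_stations)
y_km = rng.uniform(0.0, 300.0, n_stations)
spacing, coeff, slope, intercept, mean_cc = correlation_vs_spacing(
    windows, x_km, y_km)
```

A slope near zero with a non-zero mean corresponds to coherence that does not depend on station separation; a clear negative slope would indicate coherent energy that decorrelates with distance.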

The  mean  ratio  of  power  in  coherent  to  diffuse  coda  for  the  20  deep  events  was  0.097.  That  ratio 
for the deep events varied between 0.016 and 0.294. The maximum value was for an event at 536 km
depth and 117° distance, so the large amount of coherent coda was likely due to scattering at the
core-mantle boundary. Most other deep events were less than 90° distant from southern California.

For the 20 shallow events, the mean ratio of power in coherent to diffuse coda was 0.445. That
ratio for the shallow events varied from 0.104 to 1.144. The depth of the event with the highest level of
coherent coda was reported in the PDE catalogue to be 17 km. The event was at 75° distance, and the
coda window used was chosen to start after the predicted times for the pP, sP, and PcP phases, but a
record section of this event shows clear arrivals well after those predicted (figure 7). The source of this
coherent coda is not known. Nonetheless, it appears that the beamforming accurately estimated the
coherent energy, permitting its removal (figures 8 and 9). The danger in such a situation is that high
amplitude coherent coda will generate a significant amount of Rg near each station (although it is not
clear whether Rg and Lg would have resolvably different amplification at the same frequencies).

Figure 6.4: Correlation coefficients of the teleseismic coda from 21 to 61 seconds after the initial P
arrival, for each pair of SCSN stations recording a deep event, vs. interstation spacing, before (top)
and after beam removal (bottom). The mb 6.5, 606 km deep event was 73° from southern California.
The mean correlation coefficient was 0.055 before beam removal and 0.008 after, and the slope was
virtually zero, indicating complete removal of the coherent coda.

Figure 6.5: Correlation coefficients of the teleseismic coda from 20 to 45 seconds after the initial
P arrival, for each pair of SCSN stations recording a shallow event, vs. interstation spacing,
before (top) and after beam removal (bottom). The mb 6.2, 10 km deep event was 51° from
southern California. The mean correlation coefficient was 0.162 before beam removal and 0.006
after, and the slope was virtually zero, indicating complete removal of the coherent coda.

Figure 6.6: Correlation coefficients of the teleseismic coda from 40 to 95 seconds after the initial P
arrival, for each pair of SCSN stations recording a shallow event, vs. interstation spacing, before
beam removal (top), and after beam removal (bottom). The mb 6.5, 19 km deep event was 53° from
southern California. The mean correlation coefficient was 0.431 before beam removal and -0.005
after. There is a significant increase in the correlation coefficients with decreasing interstation
spacing, indicating incomplete removal of the coherent coda. The zero distance intercept was 0.052.
In this case, the coda window chosen included the PcP arrival, which is usually, but not always,
insignificant, and which may have been the source of the coherent coda, although PcP was not visible
in a record section. Some other shallow events had even more coherent coda, with no apparent cause.
Those events usually were less distant than the average.

How  sensitive  are  site  amplifications  to  differences  between  signal  parameters? 

Because of the complexity of the real earth, it is difficult to accurately predict differences in
amplifications due to differences in frequency, phase velocity, or wavetype. Empirical comparisons may
provide  a  more  useful  test  of  the  sensitivity  of  site  amplifications  to  these  parameters.  To  that  end,  we 
compare  site  amplifications  for  possibly  different  wavetypes,  and  consider  other  studies  that  quantify 
the  effects  of  scattering  near  the  surface. 

Sensitivity  to  wavetype 

An  empirical  study  confirms  the  insensitivity  of  relative  site  amplifications  to  wavetype.  Barker  et 
al.  (1981)  observed  large  differences  between  site  amplifications  for  Lg  in  three  distinctly  different 
types  of  strata  (granite,  alluvium,  and  tuff),  using  nine  3-component  stations  at  NTS,  recording  70 
regional  events.  They  found  similar  differences  in  site  amplifications  for  Pg  but  were  unable  to  discern 
any  difference  in  the  ratio  of  Lg  to  Pg  site  amplification.  The  authors  noted  that  as  Poisson’s  ratio 
increases  significantly  for  the  sediments,  the  expectation  was  that  relative  amplifications  might  differ 
there,  but  there  was  no  such  measurable  effect. 

Figure 6.9: Correlation coefficients of the teleseismic coda from 25 to 80 seconds after the initial
P arrival, for each pair of SCSN stations recording the shallow event of figure 7. The increase in
correlation coefficient with decreasing station spacing indicates that not all of the steeply incident
P-wave energy was coherent. After beam removal, the zero distance intercept was 0.056. The large
decrease in the mean value, however, from 0.434 before beam removal to -0.002 after, indicates
that most of the coherent coda energy was removed.

Comparison of site amplifications from diffuse coda of deep and shallow events

To test for possible differences in amplification between the coda from deep events and that from
shallow events, due to contamination of the shallow event coda by steeply incident P-wave energy or by
Rayleigh wave energy generated by very near-receiver scattering from the coherent coda, we calculated
site amplifications separately from the subsets of deep and shallow events. There is no apparent
difference  between  those  results  (figure  10).  The  slight  offset  of  the  diagonal  line  that  the  two  sets  of 
site  amplifications  lie  along  exists  because  the  site  amplifications  can  only  be  resolved  to  within  some 
additive  constant.  In  the  inversion  for  site  amplifications  (Baker  and  Minster,  1996),  the  sum  of  the  log 
of  the  amplifications  is  set  to  zero,  to  prevent  tradeoffs  with  the  event  magnitudes.  The  somewhat  larger 
deep  events  were  recorded  on  a  greater  range  of  stations  than  were  the  smaller  shallow  events,  so  the 
stations  recording  more  deep  events  had  a  slightly  lower  mean  log(amplification)  than  the  stations  that 
recorded both deep and shallow events. Thus, for stations common to both data subsets, the mean of the
amplification estimates was higher for site amplifications estimated from the deep events than for
those from the shallow events. What is important is that the relative amplifications are proportional for
both  sets  of  estimates  (i.e.  they  lie  along  a  diagonal  line).  The  slightly  larger  uncertainty  estimates  for 
the  shallow  event  amplifications  reflect  the  typically  smaller  number  of  events  recorded  per  station  for 
the  shallow  event  estimates.  From  this  good  correlation,  we  conclude  that  differences  in  site 
amplifications  due  to  a  small  amount  of  Rg  and  steeply  incident  P-waves  contaminating  the  Lg-like 
calibration  signal  are  less  than  the  resolution  of  the  amplification. 
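
The role of the zero-sum constraint can be illustrated with a small least-squares problem. The sketch below is a generic illustration, not the inversion of Baker and Minster (1996): it assumes each observed log amplitude is the sum of a station term and an event term, and it appends one extra equation that sets the sum of the station terms to zero. Without that equation a constant could be moved freely between station and event terms. The function name, the data layout, and the numbers are all hypothetical.

```python
import numpy as np

def invert_site_terms(observations, n_stations, n_events):
    """Solve log(amplitude) = site_term + event_term with sum(site) = 0.

    observations: iterable of (station_index, event_index, log_amplitude).
    Returns (site_terms, event_terms) from a least-squares solution.
    """
    observations = list(observations)
    n_rows = len(observations) + 1                 # one extra constraint row
    G = np.zeros((n_rows, n_stations + n_events))
    d = np.zeros(n_rows)
    for k, (i, j, value) in enumerate(observations):
        G[k, i] = 1.0                              # station term
        G[k, n_stations + j] = 1.0                 # event term
        d[k] = value
    G[-1, :n_stations] = 1.0                       # sum of site terms = 0
    model, *_ = np.linalg.lstsq(G, d, rcond=None)
    return model[:n_stations], model[n_stations:]

# Noise-free synthetic test with made-up values.
true_site = np.array([0.3, -0.1, -0.2])            # sums to zero
true_event = np.array([5.0, 6.0])
observations = [(i, j, true_site[i] + true_event[j])
                for i in range(3) for j in range(2)]
site_terms, event_terms = invert_site_terms(observations, 3, 2)
```

Because the constraint fixes only the overall level, two data subsets recorded on different station sets can yield site terms that differ by a constant, which is the offset discussed above.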

Comparison  of  site  amplifications  from  diffuse  coda  and  local  S-wave  coda 

Su  and  Aki  (1995)  calculated  site  amplifications  for  158  stations  of  the  SCSN  using  the  S-wave 
coda  of  local  events.  Comparison  of  the  diffuse  teleseismic  coda  site  amplifications  (DTCSAs)  with  Su 
and  Aki’s  local  S-wave  coda  site  amplifications  (LSCSAs)  provides  a  check  on  the  accuracy  and 
validity  of  both  sets,  as  they  were  calculated  from  coda  of  different  event  types  using  different 
estimation  techniques.  Su  and  Aki  estimated  amplifications  for  specific  frequencies,  1.5,  3.0,  6.0,  and 
12  Hz,  so  we  expect  some  variation  between  those  results  and  the  diffuse  coda  results,  estimated  for  the 
0.6  to  3  Hz  passband,  because  of  the  difference  in  frequencies  and  bandwidths  used.  Local  S-wave  coda 
is  largely  composed  of  S  and/or  surface  wave  energy  (e.g.  Wagner  and  Owens,  1993),  and  so  should  be 
of  similar  wavetype  to  diffuse  coda. 



Figure  6.10:  Comparison  of  diffuse  coda  site  amplifications  calculated  from  just  the  coda  of  the  20 
deep  events  and  those  calculated  from  just  the  coda  of  the  20  shallow  events.  The  vertical  lines 
represent  2  standard  deviations  uncertainty  about  the  deep  event  coda  site  amplifications,  and  the 
horizontal  lines  represent  the  same  for  the  shallow  event  coda  site  amplifications. 
There are 126 stations for which Su and Aki reported an amplification and for which a
minimum of 10 events were recorded for the diffuse coda amplifications. The DTCSAs and LSCSAs for those
stations correlate very well at all but the highest amplifications. For the 105 stations with amplifications
less than 3, the correlations between the DTCSAs and LSCSAs are as good as those between
LSCSAs at different frequencies (figure 11).

The high correlations of figure 11 break down at the stations with the highest amplification (figure
12).  The  reasons  for  the  breakdown  of  the  correlations  are  probably  different  for  the  two  sets  of 
correlations.  All  of  the  stations  with  much  greater  local  S-wave  coda  site  amplification  at  lower 
frequency  are  in  sediments,  and  most  are  in  the  Imperial  Valley.  The  difference  between  the  1.5  Hz  and 
the higher frequency LSCSAs may be due to resonance at 1.5 Hz, or to greater attenuation of the higher
frequency signal. Attenuation, however, cannot explain the differences at those stations between the
DTCSAs  and  the  1.5  Hz  LSCSAs,  as  the  passband  of  0.6  to  3  Hz  used  for  the  diffuse  coda  spans 
whatever  spectral  averaging  was  used  for  the  1.5  Hz  S-wave.  For  the  same  reason,  if  resonance  were  the 
cause  of  the  extreme  differences  between  the  highest  amplification  DTCSAs  and  LSCSAs,  it  could  not 
be  due  to  the  frequency  content  of  the  signals.  Resonance  at  1.5  Hz  for  local  S-wave  coda  but  not  for 
diffuse  coda  also  seems  unlikely,  as  the  wavetypes  are  very  similar,  if  not  indistinguishable.  The 
differences  between  those  sets  of  site  amplifications  could  be  due  to  differences  in  the  mechanisms  that 
trap energy in sedimentary basins from the two sources of the respective coda waves. The
source-station spacing was kept very small for the S-wave study (Su and Aki, 1995), so presumably the
hypocenters  were  directly  beneath  the  basins.  The  diffuse  coda  energy  would  largely  be  S  or  surface 
waves  scattered  from  a  greater  distance,  mostly  from  outside  the  basin.  While  we  cannot  constrain  the 
problem  further  and  so  speculate  on  details  of  the  mechanisms,  we  can  see  that  horizontally  propagating 
surface  waves  entering  a  basin  are  distinctly  different  from  S-waves  entering  a  basin  from  below. 
Regardless of the difference between the mechanisms that trap more energy in the basin in the local
S-wave case, the surface wave source is likely more appropriate for Lg.


Figure 6.11: Comparison of diffuse coda site amplifications versus 1.5 Hz local S-wave coda site
amplifications of Su and Aki (1995) for sites with amplifications less than 3.0 (upper left). The
coefficient of determination is a measure of how meaningful it is to relate two variables by a
sloping line (i.e. $Y = aX + b$). Specifically, the coefficient of determination, $r^2$, is given by
$r^2 = 1 - \mathrm{SSE}/\mathrm{SST}$, where the sum of the squared error,
$\mathrm{SSE} = \sum y_i^2 - b \sum y_i - a \sum x_i y_i$, is a measure of how
much variation is left unexplained by the model, and the total squared error,
$\mathrm{SST} = \sum y_i^2 - (\sum y_i)^2 / n$,
is a measure of the total amount of variation in the observed values of the dependent variable.
Thus SSE/SST is the proportion of the total variation that is not predicted by the linear model,
and $r^2$ is the proportion of variation in the dependent variable that is predicted by the linear model.
The correlation between the local S-wave coda amplifications at 1.5 Hz and the diffuse coda
amplifications is about as good as between the local S-wave coda amplifications at 1.5 Hz and at
3 Hz (upper right). The correlation is much poorer for larger differences in frequency at the same
site (lower plots).
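
The definition of the coefficient of determination given in the caption of Figure 6.11 can be checked numerically. The sketch below assumes an ordinary least-squares fit of Y = aX + b and evaluates SSE and SST with the sums written in the caption. The data values are made up for illustration and are not amplifications from this study; the function name is hypothetical.

```python
import numpy as np

def coefficient_of_determination(x, y):
    """r^2 = 1 - SSE/SST for a least-squares line y = a*x + b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    a, b = np.polyfit(x, y, 1)                       # slope a, intercept b
    sse = np.sum(y ** 2) - b * np.sum(y) - a * np.sum(x * y)
    sst = np.sum(y ** 2) - np.sum(y) ** 2 / n
    return 1.0 - sse / sst

# Made-up example values.
x = [1.0, 1.5, 2.0, 2.5, 3.0]
y = [1.1, 1.4, 2.2, 2.3, 3.1]
r2 = coefficient_of_determination(x, y)

# For a straight-line fit, r^2 equals the squared correlation coefficient.
r2_from_corr = np.corrcoef(x, y)[0, 1] ** 2
```

The shortcut expression for SSE holds only when a and b are the least-squares estimates, which is why the fit is computed inside the function.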

[Figure panels: horizontal axis "Local S-Wave Amplifications at 1.5 Hz" in each of four panels; coefficients of determination 0.49, 0.63, 0.82, and 0.43.]


Figure  6.12:  Comparison  of  diffuse  coda  site  amplifications  versus  local  1.5  Hz  S-wave  coda  site 
amplifications  of  Su  and  Aki  (1995)  for  all  amplifications  (upper  left).  The  linear  relationship 
seen  in  figure  9  breaks  down  for  the  highest  amplification  sites,  with  approximately  5  times  higher 
amplification  estimated  from  the  1.5  Hz  local  S-wave  coda  than  from  the  diffuse  coda.  The  local 
S-wave coda site amplifications are also much greater at 1.5 and 3 Hz (upper right), than at 6 and
12  Hz  (lower  plots). 
Conclusions

We  have  calculated  site  amplifications  from  the  diffuse  component  of  teleseismic  coda  for  189 
stations of the SCSN. Reviews of previous research on site amplification and the nature of Lg and
diffuse coda indicate that these site amplifications are appropriate for Lg amplification.

Several  pieces  of  evidence  indicate  that  site  amplifications  are  insensitive  to  wavetype  within  the 
resolution  of  studies  performed.  For  example,  relative  amplifications  for  a  range  of  site  geologies  were 
found  to  be  indistinguishable  for  Pg  and  Lg  phases.  The  diffuse  coda  site  amplifications  are  the  same 
for  deep  and  shallow  event  coda,  despite  some  known,  but  generally  minor,  contamination  of  the 
shallow  event  coda  with  Rg  and  steeply  incident  P-wave  energy.  Site  amplifications  are  also  similar  for 
diffuse  coda  and  local  S-wave  coda.  As  the  local  S-wave  coda  has  been  shown  to  correlate  well  with 
direct  crustal  S-wave  amplifications,  the  higher  mode  and  direct  S  wave  amplifications  must  also  be 
quite  similar. 

Even  if  there  were  some  sensitivity  of  amplification  to  wavetype,  diffuse  coda  and  Lg  appear  to 
have similar source generation (scattering from Rg to higher mode surface waves, or direct P- to
S-wave scattering within a shallow low velocity surface layer), and so should be composed of the same, or
very  similar,  wavetypes.  This  hypothesis  is  confirmed  by  array  studies,  which  indicate  that  both  Lg  and 
diffuse  coda  are  composed  of  S-waves  trapped  in  the  crust,  or  equivalently,  higher  mode  surface  waves. 
Finally,  we  effectively  remove  the  great  majority  of  coherent  teleseismic  coda  from  the  calibration 
signals. 

Diffuse  coda  is  extensively  scattered,  as  evidenced  by  array  studies  for  which  most  of  the  energy  in 
late coda is incoherent and cannot be modeled in terms of plane waves, and so is fairly isotropic.

The  site  amplification  estimation  worked  well,  as  the  diffuse  coda  site  amplifications  are  well 
correlated  with  local  S-wave  coda  site  amplifications,  except  at  the  highest  amplification  sites.  The 
differences  between  the  two  sets  of  site  amplifications  for  the  highest  amplification  sites,  which  were  in 
sedimentary  basins,  may  be  due  to  resonance  in  the  local  S-wave  case  because  the  earthquake  source 
was directly beneath the basin. This indicates that recalculating the amplifications for diffuse coda,
rather than simply using the local S-wave coda results, may have been important in avoiding later
misinterpretation  of  propagation  results  in  basins. 

Examination  of  the  nature  of  Lg  and  diffuse  coda,  and  results  of  the  amplification  estimation, 
suggest  important  considerations  for  the  design  of  a  study  that  will  allow  observation  of  changes  in 
absolute  Lg  amplitudes  with  propagation. 

Site  amplifications  are  most  sensitive  to  frequency,  and  possibly  also  to  incidence  angle,  so  a 
narrow  passband  should  be  used  for  both  Lg  and  diffuse  coda  site  amplifications,  to 

1)  minimize  variations  of  frequency  due  to  Lg  dispersion, 

2)  minimize  variations  of  frequency  due  to  differences  between  sources  (both  Lg  and  diffuse  coda),  and 

3) avoid possible differences in frequency content of Lg at different stations (for different events) due
to different attenuation characteristics along different paths.
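
As a rough illustration of restricting records to a common narrow passband, the sketch below zeroes all Fourier components outside a chosen band. It is a simplification made for brevity: a brick-wall filter in the frequency domain stands in for whatever tapered filter would be used in practice, and the 0.6 to 3 Hz band is the diffuse coda passband quoted earlier in this chapter. The function name, sampling rate, and test signal are hypothetical.

```python
import numpy as np

def fft_bandpass(signal, sample_rate_hz, f_low_hz, f_high_hz):
    """Keep only Fourier components between f_low_hz and f_high_hz."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / sample_rate_hz)
    keep = (freqs >= f_low_hz) & (freqs <= f_high_hz)
    spectrum[~keep] = 0.0                    # brick-wall: no taper applied
    return np.fft.irfft(spectrum, n=signal.size)

# Synthetic record: a 1 Hz component inside the band plus a 10 Hz
# component outside it, sampled at 50 Hz for 20 seconds.
fs = 50.0
t = np.arange(1000) / fs
in_band = np.sin(2 * np.pi * 1.0 * t)
out_of_band = np.sin(2 * np.pi * 10.0 * t)
filtered = fft_bandpass(in_band + out_of_band, fs, 0.6, 3.0)
```

Applying the same band to both the Lg records and the diffuse coda calibration signal keeps the two measurements comparable in frequency content.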

The  sensitivity  of  site  amplifications  to  frequency  and  incidence  also  suggests  that  Lg  should  be 
observed  in  an  early  and  narrow  group  velocity  window,  to 

1)  minimize  variations  in  frequency  and  wavenumber  (and  so  incidence)  due  to  Lg  dispersion; 

2)  minimize  variations  in  frequency  and  wavenumber  due  to  the  excitement  of  different  modes  by 
different  sources; 

3)  minimize  the  effect  multi-pathed  arrivals  in  later  Lg  could  have  on  recognizing  blockage;  and 

4) avoid the inclusion of Rg, as it likely scatters differently than the higher modes. There is also a small
chance that it has a different amplification than the higher modes.

The  choice  of  Lg  group  velocity  window  will  be  limited  by  the  extent  of  velocity  variation  across  the 
region  studied. 
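
A group velocity window translates into a time window through the source-receiver distance: energy traveling at group velocity U arrives at time distance/U after the origin time. The sketch below shows that conversion only. The velocities and distance in the example are placeholder values chosen for illustration and are not the window used in this study; the function name is hypothetical.

```python
def group_velocity_window(distance_km, u_fast_km_s, u_slow_km_s):
    """Return (start_s, end_s) after origin time for a group velocity window.

    The window opens at the arrival time of the fastest group velocity
    and closes at the arrival time of the slowest.
    """
    if not (u_fast_km_s > u_slow_km_s > 0.0):
        raise ValueError("need u_fast_km_s > u_slow_km_s > 0")
    return distance_km / u_fast_km_s, distance_km / u_slow_km_s

# Placeholder values: a station 350 km from the source and a window
# bounded by group velocities of 3.5 and 3.3 km/s.
start_s, end_s = group_velocity_window(350.0, 3.5, 3.3)
window_length_s = end_s - start_s
```

For a fixed pair of velocities the window lengthens in proportion to distance, so a narrow window in group velocity is not a fixed-length window in time.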

Additionally,  the  diffuse  coda  calibration  signal  should  be  taken  from  a  late  coda  window  to 
minimize the influence of very near station Rg in the calibration signal, and maximize the isotropy of
scattered energy.
Chapter 7


Conclusion  to  the  Dissertation 

Introduction 

In  this  chapter  I  review  what  has  been  accomplished.  The  research  described  in  chapters  2  and  3 
and  in  appendix  1  is  complete  in  and  of  itself,  and  I  summarize  those  results.  Similarly,  chapters  5  and  6 
describe  results  that  are  important  in  and  of  themselves.  In  addition,  those  results  have  considerable 
ramifications  for  improving  the  accuracy  of  regional  seismic  discrimination.  Such  improvement  is  the 
long  term  objective  which  provided  the  motivation  for  the  work  described  in  the  later  chapters. 

The  chapter  is  organized  as  follows:  I  first  briefly  review  the  state  of  knowledge  regarding  regional 
propagation  and  blockage,  and  its  presumed  role  in  misclassification  by  regional  discriminants.  I  also 
discuss  what  was  learned  by  evaluating  measurements  of  Lg/Pg  amplitude  ratios  recorded  at  Southern 
California  Seismic  Network  (SCSN)  stations.  I  describe  how  that  led  to  the  decision  that  determining 
site  amplifications  at  SCSN  stations  was  one  of  the  most  important  steps  that  could  be  taken  towards 
better  understanding  of  Lg  blockage  and  improvement  of  discrimination  accuracy.  In  the  section 
following  that,  as  a  proof-of-concept  test,  I  apply  the  site  amplifications  to  SCSN  recordings  of  three 
regional  earthquakes  and  one  nuclear  explosion,  and  demonstrate  that  they  perform  as  predicted,  i.e. 
application  of  site  amplification  corrections  permits  us  to  separate  path  from  receiver  effects.  In  the 
penultimate  section,  I  discuss  the  direction  of  future  research  that  has  been  made  possible  by  the  results 
presented  in  chapters  5  and  6.  In  the  concluding  section,  I  summarize  the  results  that  have  been 
achieved  in  each  portion  of  the  dissertation. 

What  we  learned  that  indicated  that  site  amplifications  are  important 

As  noted,  our  long  term  objective  has  been  to  improve  the  accuracy  of  regional  seismic 
discrimination.  The  initial  step  toward  achieving  that  objective  was  assessing  the  current  state  of 
knowledge regarding regional propagation and determining how best to improve it. In doing the
research  necessary  to  making  that  determination,  I  also  contributed  further  to  that  base  of  knowledge.  I 
first  identified  gaps  and  possible  misconceptions,  or  at  least  untested  assumptions,  in  the  current  state  of 
knowledge  regarding  regional  propagation.  It  is  well  established  that  the  best  existing  regional  seismic 
discriminants have a high rate of error (e.g. Taylor, et al., 1989). It has also been clearly recognized that
propagation  effects  are  important  in  some  cases  and  that  path  corrections  could  reduce  the  incidence  of 
misclassifications  (e.g.  Zhang,  et  al.,  1994).  Our  initial  contributions  to  understanding  the  problem  of 
reducing  errors  in  regional  event  classification  were  the  demonstrations  that  (1)  changes  in  Lg/Pg 
amplitude ratios occur over very short spatial scales, ~20 km, often not clearly attributable to major
tectonic  boundaries,  and  that  (2)  areas  in  which  the  discriminant  value  has  been  affected  enough  to 
cause  misclassification  are  spatially  coherent  (figures  4.3-4.10).  Analysis  of  patterns  of  Lg/Pg 
amplitude  ratios  for  regional  sources  from  a  wide  variety  of  azimuths  about  the  SCSN  also  let  us 
conclude  that  path  effects  are  the  dominant  cause  of  misclassification  by  the  Lg/Pg  discriminant. 

In addition to indicating the direction that further research should take, the conclusions discussed
above are relevant to understanding the limitations of ongoing efforts at regionalization (e.g. Dowla et
al.,  1996,  Randall,  et  al.,  1996)  and  of  attempts  at  developing  empirical  path  corrections. 
Regionalization  is  an  attempt  to  quantify  path  properties,  and  so  path  effects,  in  politically  critical 
regions  (e.g.  the  Middle  East,  North  Africa,  and  China).  This  is  probably  most  effective  in  areas  of 
simple  structure  with  a  single  structural  feature  that  disrupts  regional  propagation,  such  as  the  Arabian 
shield,  which  is  bounded  by  narrow  sections  of  oceanic  crust  (Vernon  et  al.,  1996).  In  the  area  of 
empirically  derived  path  corrections,  Zhang  et  al.  (1996)  achieved  a  22%  reduction  in  variance  for  the 
Lg/Pg  amplitude  ratio  using  a  correction  based  on  the  product  of  distance  and  topographic  roughness. 
Path  corrections  that  are  transportable,  that  is,  corrections  that  may  be  applied  in  uncalibrated  regions, 
are  the  most  valuable  to  us.  How  far  afield  from  the  calibration  paths  such  topography-based 
corrections  could  be  transported  is  not  clear.  The  observations  of  abrupt  changes  in  Lg/Pg  amplitude 
ratios  over  very  short  paths  suggest  that,  assuming  reciprocity  (as  I  have  used  many  closely  spaced 
stations and few events), in a structurally complex area like southern California, exact knowledge of the
path  effect  from  one  source  position  to  one  receiver  position  would  have  minimal  predictive  value  for 
another  source  30  km  distant  from  the  calibration  source  position.  Thus,  to  ensure  a  large  improvement 
in  discriminant  capability,  empirical  calibration  of  path  effects  may  have  to  be  performed  on  a  much 
finer  scale  than  is  practical.  The  alternative  to  fine-scale  calibration  over  much  of  the  globe  is  to  attempt 
to  understand  the  physical  basis  for  the  fine  scale  variations  observed  in  regional  phase  propagation. 
Such  understanding  may  permit  the  prediction  of  those  fine  scale  changes,  and  at  the  very  least  should 
make  clearer  the  limitations  of  path  corrections  and  so  enable  better  estimation  of  the  uncertainty  in 
amplitude  ratios. 

I began this section by discussing amplitude ratios. The blockage of Lg, however, has long been a
topic  of  great  interest  on  its  own.  In  much  of  the  work  on  Lg  based  regional  discriminants,  it  appears  to 
be  a  common  implicit  assumption  that  variations  in  discriminant  values  are  due  largely  to  changes  in 
Lg.  Partly  because  of  this,  most  of  the  work  on  regional  phase  propagation  and  blockage  has  focused  on 
Lg,  as  discussed  in  chapters  4  and  6.  While  Lg  blockage  and  variations  in  Lg/Pg  amplitude  ratios  are 
related  topics,  it  does  not  follow,  nor  has  it  been  demonstrated,  that  Lg  blockage  alone  accounts  for 
misclassifications  by  the  Lg/Pg  discriminant.  It  is  largely  an  unresolved  question  whether  or  not  Lg  and 
Pg  suffer  similar  blockage,  with  second  order  effects  on  their  amplitudes  causing  variations  in  their 
amplitude  ratios.  To  understand  the  physical  basis  for  the  changes  in  Lg/Pg  amplitude  ratios,  it  is  clear 
that  it  will  be  necessary  to  resolve  the  question  of  how  each  phase  varies,  and  I  return  to  this  topic  in  the 
next  section. 

From  our  review  of  the  literature  and  our  observations  of  Lg/Pg  amplitude  ratios  across  southern 
California,  I  determined  that,  to  significantly  improve  regional  seismic  discrimination,  we  would  have 
to  understand  the  physics  of  the  propagation  of  both  Lg  and  Pg  and  not  just  empirically  quantify  their 
relative  variation.  A  review  of  the  literature  on  blockage  turned  up  an  extensive  body  of  theoretical 
studies  of  Lg  propagation  and  blockage,  but  no  observations  that  provided  appropriate  constraints  for 
such  studies.  For  that,  it  would  be  essential  to  know  the  instrument  calibration  and  site  amplifications  at 
all recording stations so that absolute amplitude measurements could be made, and site effects removed.
Because  I  had  also  determined  that  Lg/Pg  amplitude  ratios  varied  over  very  short  spatial  scales,  it  was 
clear  that  observations  of  changes  in  absolute  amplitudes  of  Lg  and  Pg  would  have  to  be  made  at  the 
same  short  scale;  hence  the  decision  to  estimate  site  amplifications  appropriate  for  all  possible  SCSN 
stations. 

I  began  by  estimating  site  amplifications  appropriate  for  Lg.  This  was  a  major  undertaking  which 
began  with  a  thorough  review  of  the  nature  of  both  Lg  and  teleseismic  coda  and  of  which  parameters 
are  important  in  controlling  site  amplification.  The  work  also  required  the  development  of  complex 
statistical  tools  for  the  inversion  of  doubly  censored  data  from  a  very  heavy-tailed  distribution,  and  the 
development  of  a  methodology  for  isolating  the  near-receiver-scattered  component  of  teleseismic  coda 
from  the  near-source-scattered  component,  to  say  nothing  of  the  extensive  data  processing  required. 
That is the work described in chapters 5 and 6, and summarized in the final section. The work is
important both for the tools developed and for the results obtained.

The  estimation  of  site  amplifications  concludes  the  research  accomplished  for  the  dissertation,  but 
as  I  have  presented  the  work  in  the  broader  context  of  understanding  regional  propagation,  I  will  discuss 
how  the  results  will  be  applied  in  the  future.  We  have  been  laying  the  groundwork  for  ultimately 
improving  the  performance  of  the  Lg/Pg  regional  discriminant,  by  permitting  observation  of  changes  in 
just  the  amplitude  of  Lg  over  short  path  segments  and  so  developing  the  data-based  constraints  on 
modeling  that  are  currently  lacking.  The  first  question  to  ask  then  is  whether  the  site  amplification 
corrections  work.  Do  they  indeed  permit  separation  of  the  site  effects  from  propagation  effects  on 
amplitude?  I  apply  the  site  amplification  corrections  to  regional  data  as  a  proof-of-concept  test,  and 
discuss  the  results  of  that  test  and  their  implications  for  future  research. 

Application  of  site  amplifications 

I  apply  the  site  amplification  corrections  to  the  records  from  three  regional  earthquakes  in  distinctly 
different  source  regions  and  one  nuclear  explosion  from  the  Nevada  Test  Site  (NTS).  As  discussed 
above, understanding Lg blockage and understanding variations in Lg/Pg amplitude ratios are two
different  issues;  to  understand  how  different,  I  applied  the  site  amplification  corrections  to  both  Lg  and 
Pg  amplitude  measurements.  In  applying  the  corrections  to  both  Lg  and  Pg,  I  assume  that  site 
amplifications  are  the  same  for  both  phases,  as  was  found  to  be  the  case  for  a  variety  of  site  geologies 
(Barker et al., 1981). Even if the assumption breaks down at some level of resolution, it is unlikely that
differences  would  be  great  enough  to  affect  the  large-scale  pattern  of  amplitude  variations  observed.  I 
assume  equal  amplifications  for  Lg  and  Pg  only  for  this  step,  which  is  largely  a  proof-of-concept  test 
for  our  site  amplification  corrections.  More  rigorous  investigation  will  be  required  before  inferences  can 
confidently  be  made  based  on  fine  details  of  variations  between  Lg  and  Pg  amplitudes. 
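
As a minimal sketch of how such a correction might be applied, assume that amplitudes and site amplifications are multiplicative, so that the correction is a subtraction in log10 units, and that one factor per station serves both phases. The station names and numbers below are invented for illustration and are not SCSN values.

```python
import math

def correct_for_site(amplitude, site_amplification):
    """Return log10 amplitude with the multiplicative site term removed."""
    return math.log10(amplitude) - math.log10(site_amplification)

def demean(values):
    """Subtract the mean so that spatial variations stand out."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical Lg amplitudes (nm/sec) and site amplification factors.
lg_amplitude = {"STA1": 400.0, "STA2": 100.0, "STA3": 50.0}
site_factor = {"STA1": 4.0, "STA2": 1.0, "STA3": 0.5}

stations = sorted(lg_amplitude)
corrected = [correct_for_site(lg_amplitude[s], site_factor[s]) for s in stations]
demeaned = demean(corrected)
```

In this invented case the raw amplitudes differ by nearly an order of magnitude, yet the corrected values coincide: all of the apparent variation was a site effect, which is the kind of separation the proof-of-concept test is meant to check.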

Mendocino  earthquake:  The  first  example  of  the  correction  for  site  amplifications  is  for  event  1  of 
figures 4.2 and 4.9, an mb 4.2 earthquake near Cape Mendocino. In figure 7.1 I show demeaned
log(Lg/Pg) amplitude ratios; demeaning makes the pattern of amplitude ratio variation more discernible.
Crosses  no  longer  represent  stations  specifically  classifying  an  event  as  an  earthquake,  as  they  did  in 
figure  4.9,  but  indicate  larger  than  average  Lg/Pg  amplitude  ratios,  which  are  more  earthquake-like 
values.  Similarly,  circles  represent  the  more  explosion-like  smaller  Lg/Pg  amplitude  ratios.  I  have  used 
just  the  earliest  part  of  the  Lg  window,  from  3.6  to  3.3  km/sec  group  velocity,  in  order  to  avoid  masking 
of blockage or attenuation of Lg by the arrival of later multi-pathed or back-scattered energy. The
observed patterns of variation, however, are generally indistinguishable from those for a group velocity
window of 3.6 to 2.8 km/sec.
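
The group-velocity window translates into a time window that depends on epicentral distance. The sketch below is illustrative only: it assumes the trace starts at the origin time and is uniformly sampled, and the function names and the use of an rms amplitude measure are assumptions, not the procedure used for the figures.

```python
import math

def group_velocity_window(distance_km, v_fast=3.6, v_slow=3.3):
    """Time window (s after origin) bounded by two group velocities (km/sec)."""
    return distance_km / v_fast, distance_km / v_slow

def rms_in_window(samples, dt, t_start, t_end):
    """RMS of the samples whose times (index * dt) fall inside the window."""
    i0 = max(0, math.ceil(t_start / dt))
    i1 = min(len(samples), math.floor(t_end / dt) + 1)
    window = samples[i0:i1]
    if not window:
        raise ValueError("window contains no samples")
    return math.sqrt(sum(x * x for x in window) / len(window))

# A station 396 km from the source: the early Lg window spans roughly 110-120 s.
t_start, t_end = group_velocity_window(396.0)
```

Widening the window to 3.6 to 2.8 km/sec only requires passing v_slow=2.8.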

Figure  7.2  shows  the  Lg  amplitudes,  corrected  for  site  amplification,  from  the  same  event.  These 
values  are  also  demeaned  to  make  the  variations  more  easily  discernible.  The  absolute  values  of  Lg 
amplitudes  for  this  event  range  from  zero  (i.e.  the  pre-event  noise  level)  to  nearly  2000  nm/sec.  The 
pattern  is  dominated  by  a  decrease  in  Lg  amplitude  with  distance,  as  expected.  There  is  also  a  difference 
between  amplitudes  on  either  side  of  the  San  Andreas  fault  with  smaller  Lg  on  the  southwest  side. 
Figure 7.3, the Pg amplitudes corrected for site amplification, shows that a similar but less pronounced
difference is apparent in the Pg amplitudes. This suggests that similar mechanisms are affecting
amplitudes of both phases, but to a different extent. In the area of smallest log(Lg/Pg) in figure 7.1, the
same area in which stations misclassifying the event are located (figure 4.9), the absolute Pg amplitudes
are not smaller to the southwest of the San Andreas fault. Thus the misclassifications may be due to
greater blockage of Lg than Pg in that area. Conversely, the area of greatest Lg/Pg amplitude ratio
(figure 7.1) appears to be due to a much faster decay, or blockage, of Pg than of Lg.

Figure 7.1: Log(Lg/Pg) amplitude ratios recorded at SCSN stations for event number one of figure 4.2. The values are demeaned, with values greater than the mean plotted as crosses and those less than the mean plotted as circles. Symbols are scaled by distance from the mean. Arrows indicate the propagation direction.

Figure 7.2: Lg amplitudes, after correction for site amplifications, for event one.

Figure 7.3: Pg amplitudes, after correction for site amplifications, for event one.

Eastern  Sierra  Earthquake:  Figure  7.4  shows  the  demeaned  log(Lg/Pg)  amplitude  ratios  for  event  3 
of  figure  4.2.  The  smallest  Lg/Pg  amplitude  ratios  are  in  the  southernmost  part  of  the  array,  and  as  seen 
in  figure  4.8,  the  difference  in  amplitude  ratios  is  great  enough  there  for  the  event  to  be  misclassified. 
Although  the  distance  effect  dominates  figures  7.5  and  7.6,  the  site-corrected  Lg  and  Pg  amplitudes, 
there  is  a  significant  difference  between  the  two  plots.  Pg  transmission  appears  to  be  much  more 
efficient than Lg transmission throughout a north-south corridor at about 117° longitude. For this
earthquake,  it  appears  that  event  misclassification  using  the  log(Lg/Pg)  discriminant  is  attributable 
directly  to  Lg  blockage. 

Nuclear  explosion  at  NTS:  In  contrast  to  the  situation  with  earthquake  records,  stations 
misclassifying  nuclear  explosions  have  earthquake-like,  large  Lg/Pg  ratios.  The  demeaned  log(Lg/Pg) 
amplitude  ratios  (figure  7.7)  for  this  explosion  (the  same  explosion  as  in  figure  4.3c)  vary 
approximately  inversely  to  those  of  the  eastern  Sierra  earthquake  (cf.  figure  7.4).  The  large  Lg/Pg 
amplitude  ratios  are  in  a  corridor  running  south  from  NTS  through  the  eastern  Mojave  block  and  on 
down through the Peninsular Range. The plots of site-corrected Lg and Pg (figures 7.8 and 7.9
respectively)  indicate  that  the  difference  is  due  to  especially  good  Lg  transmission.  Lg  amplitudes  are 
quite  high  (figure  7.8),  while  Pg  amplitudes  (figure  7.9)  are  average.  These  figures  also  indicate  that  Lg 
and  Pg  both  suffer  an  abrupt  change  in  amplitude  along  the  most  northerly  paths  recorded,  which  cross 
the  southern  Sierra,  the  Great  Basin,  and  eventually,  the  San  Andreas  fault.  For  those  paths,  both  phases 
were  equally  affected  and  the  discriminant  values  remained  accurate. 

Figure 7.4: Log(Lg/Pg) amplitude ratios for event number three of figure 4.2.

Baja earthquake: As was observed in the records for the three previous events discussed, there is a
sharp  separation  between  large  and  small  Lg/Pg  amplitude  ratios  (figure  7.10)  recorded  for  this  Baja 
California  earthquake  (event  six  of  figure  4.2).  For  the  most  part,  the  differences  observed  in  the 
amplitude  ratios  are  not  obvious  in  the  images  of  site-corrected  Lg  and  Pg  amplitudes,  which  are  very 
similar  (figures  7.11  and  7.12).  Both  figures  indicate  that  propagation  of  the  crustal  phases  is  much  less 
efficient  in  the  thinning  crustal  waveguide  near  the  coast. 

Discussion  of  the  application  of  site  amplification  corrections  and  future  work 

In  the  preceding  section,  I  have  demonstrated  that  the  site  amplification  corrections  computed  for 
the SCSN stations enable isolation of propagation effects on regional phases. Application of the site
amplification corrections indicates that observed variations in the log(Lg/Pg) amplitude ratio
discriminant can be better understood by analyzing the separate amplitude variations of Lg and Pg over a
variety of paths. Specifically, for paths along different azimuths, through varying structure, we have seen:

1)  path  segments  where  both  Lg  and  Pg  are  blocked,  but  fairly  equally,  so  that  the  Lg/Pg  discriminant 
remains  effective, 

2)  paths  where  Lg  appears  to  be  blocked  to  a  greater  extent  than  Pg,  so  that  the  discriminant  value  is 
lowered  and  earthquakes  may  be  misclassified  as  explosions,  and 

3)  paths  where  Pg  appears  to  be  blocked  to  a  greater  extent  than  Lg,  so  that  the  discriminant  value  is 
increased  and  explosions  may  be  misclassified  as  earthquakes. 

Although  it  is  tempting  to  speculate  on  details  of  the  amplitude  variations  observed  in  these  figures, 
it  is  easy  to  be  misled  by  preconceptions  and  I  intend  to  systematically  quantify  the  observations  for  a 
number  of  events.  Specifically,  I  intend  to  develop  an  interpolation  scheme  so  that  changes  in  Lg 
amplitude  may  be  estimated  along  short  path  segments  in  the  propagation  direction  and  correlated  with 
path  properties.  This  will  provide  the  observations  of  Lg  amplitude  changes  necessary  to  constrain 
models  of  Lg  propagation,  which  is  essential  to  understanding  Lg  blockage  and  to  improving  the 
performance  of  regional  seismic  discrimination. 

Several further projects are indicated as well, still specifically related to the goal of improving
nuclear  verification.  The  next  step  should  be  a  rigorous  assessment  of  the  relative  site  amplifications  of 
Lg  and  Pg.  Then,  a  study  of  Pg  propagation,  similar  to  that  described  above  for  Lg,  should  be 
performed.  Only  when  the  mechanisms  of  Lg  and  Pg  blockage  are  better  understood  will  it  be  possible 
to  assess  their  relative  efficiency  in  blocking  each  phase  and  so  develop  highly  accurate,  transportable 
path  corrections  for  the  discriminant. 

Conclusions 

I  will  recapitulate  what  has  been  achieved  in  the  course  of  completing  this  dissertation.  The  first 
half  of  the  thesis  dealt  with  the  receiver  function  technique.  We  first  improved  the  calculation  of 
receiver  functions  by  developing  a  simultaneous  time-domain  deconvolution  (Appendix  1).  We  then 
provided  a  thorough  analysis  of  the  resolution  of  the  technique,  which  indicated  that  receiver  function 
amplitudes  are  extremely  poorly  constrained  and  should  not  be  used  to  infer  anything  more  than  relative 
amplitude  or  dip  of  interfaces.  This  is  important  because  it  has  been  common  to  make  direct  inferences 
of  dip  and/or  velocity  jumps  from  receiver  function  amplitudes.  On  the  other  hand,  we  demonstrated 
that  the  arrival  times  of  receiver  function  phases  are  quite  robust  to  additive  noise  and  the  effects  of 
deconvolution  (Chapter  3).  We  extended  the  technique’s  application  to  an  area  with  complex  Moho 
topography  by  constraining  possible  models  with  other  observations  of  the  same  data,  including 
azimuthal  variations  in  receiver  function  waveforms,  and  polarization  of  the  initial  unprocessed  P-wave. 
From  the  technique’s  application  to  data  recorded  at  Pinon  Flat  Observatory  (PFO),  on  the  boundary 
between  the  rifting  Salton  Trough  and  the  10,000  foot  Peninsular  Range,  we  were  able  to  make  several 
important  inferences  regarding  the  structure  of  the  region.  Specifically,  the  Moho  appears  to  be 
gradational  beneath  the  Salton  Trough,  and  undergoes  very  rapid  spatial  changes  in  depth,  apparently  in 
large  step  offsets  that  maintain  Airy  isostasy.  In  the  mid-crust  we  identified  a  significant  low-velocity 
zone  which  may  correspond  to  a  highly  conductive  layer  indicated  by  a  nearby  resistivity  profile, 
possibly  due  to  the  pooling  of  saline  fluids  at  the  base  of  the  pluton  on  which  PFO  rests. 


The  latter  half  of  the  thesis  dealt  with  site  amplifications  at  the  SCSN.  We  adapted  statistical 
techniques  to  the  estimation  of  site  amplifications  and  network  magnitudes.  These  techniques  could 
improve  the  accuracy  of  parameter  estimates  in  other  areas  of  geophysical  data  analysis,  as  they  deal 
effectively  with  two  problems  common  in  geophysical  data.  One  technique,  robust  reweighting  of  data, 
is  straightforward  in  its  application  and  can  lead  to  significant  improvement  in  the  accuracy  of 
parameters  estimated  from  large  sets  of  data  drawn  from  a  heavy-tailed  distribution.  The  second  method 
allows  the  incorporation  of  censored  data  into  parameter  estimates.  Application  of  this  technique  is 
more  involved  than  the  first,  but,  where  necessary,  it  can  prevent  biasing  of  parameter  estimates.  The 
site  amplifications  themselves  are  important  for  hazard  assessment.  We  have  verified  previously 
estimated  site  amplifications,  significantly  expanded  the  base  of  known  site  amplifications,  and  perhaps 
most importantly, indicated a difference in the mechanisms by which resonance in sedimentary basins is
generated.
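
To illustrate what robust reweighting does for heavy-tailed data, the following sketch estimates a single location parameter by iteratively reweighted least squares. The Huber weight function, the tuning constant 1.345, and the MAD scale estimate are common choices assumed here for illustration; they are not necessarily those used in chapters 5 and 6, and the sample values are invented.

```python
def median(values):
    """Median of a non-empty list."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])

def huber_weights(residuals, scale, k=1.345):
    """Weight 1 for small residuals, k/|z| beyond k scaled units."""
    weights = []
    for r in residuals:
        z = abs(r) / scale
        weights.append(1.0 if z <= k else k / z)
    return weights

def robust_mean(values, n_iter=50, tol=1e-10):
    """Location estimate by iteratively reweighted least squares."""
    estimate = median(values)
    for _ in range(n_iter):
        residuals = [v - estimate for v in values]
        mad = median([abs(r) for r in residuals])
        scale = 1.4826 * mad if mad > 0 else 1.0   # MAD scaled to a Gaussian sigma
        w = huber_weights(residuals, scale)
        new_estimate = sum(wi * vi for wi, vi in zip(w, values)) / sum(w)
        if abs(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate

# Invented log-amplitude residuals with one gross outlier.
sample = [1.0, 1.1, 0.9, 1.05, 0.95, 25.0]
plain = sum(sample) / len(sample)
robust = robust_mean(sample)
```

The ordinary mean is dragged far from the bulk of the data by the single outlier, while the reweighted estimate stays close to it.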

In  determining  that  the  best  way  to  improve  the  performance  of  the  Lg/Pg  regional  discriminant 
would  be  to  enable  observation  of  variations  of  Lg  blockage  over  a  fine  spatial  scale,  which  the 
estimation  of  site  amplifications  does,  we  made  several  improvements  in  our  understanding  of  regional 
propagation.  Specifically,  we  showed  that  path  effects  dominate  misclassification  for  most  events  and 
that changes in discriminant value are due to path effects that occur over ~20 km length scales. We
demonstrated  a  need  for  the  quantification  of  blockage,  and  that  site  amplifications  were  necessary  for 
that.  From  application  of  the  site  amplification  corrections,  we  showed  that  changes  in  the  discriminant 
value  are  not  necessarily  due  just  to  Lg  blockage,  but  to  changes  in  amplitudes  of  both  phases.  The 
application  of  the  corrections  showed  that  by  calculating  site  amplifications  for  the  SCSN,  we  have 
made  it  possible  to  measure  changes  in  absolute  Lg  amplitudes  due  to  propagation  effects  at  the  fine 
spatial  scale  at  which  Lg  blockage  occurs.  Such  measurements  will  provide  the  constraints  necessary 
for  further  progress  in  modeling,  and  so  in  understanding,  Lg  blockage. 

Receiver functions are produced by deconvolving the horizontal components of a
seismogram by the vertical component. Typically this is performed by spectral
division of recordings of single events. Receiver functions from events with similar
backazimuth and ray parameter are then stacked to improve the signal-to-noise
ratio. To avoid the subjective and time-consuming method of prewhitening typically
employed with spectral division, we have cast the deconvolution in the time domain.
By performing the time-domain deconvolution as a regularized simultaneous
inversion for a group of events that would normally have been stacked after
deconvolution, we find that side lobes are reduced and resolution is improved.
Furthermore, the regularized inversion allows the user to choose among a variety of
objective model norms. In this paper, we present results from inversions employing
the L2 norm and lower-bounded least squares.

Key words: deconvolution, receiver function.


INTRODUCTION 

Receiver functions are produced by deconvolving the radial
and transverse horizontal components of a seismogram by
the vertical component, thereby isolating P-to-S converted
phases and reverberations ending in an S phase (Langston
1989, 1981, 1979; Owens, Crosson & Hendrickson 1988;
Owens & Crosson 1988; Owens, Taylor & Zandt 1987;
Owens, Zandt & Taylor 1984). To improve the signal-to-noise
ratio, receiver functions are often binned by similar
backazimuth and ray parameter (henceforth we will use the
term 'bin' for this grouping of receiver functions or
seismograms), and then stacked. It is well known that in the
presence of noise, deconvolution is unstable (Sipkin &
Lerner-Lam 1992). The instabilities of deconvolution due to
noise are usually addressed by prewhitening the time series
(usually for frequency-domain deconvolution; e.g. Owens et
al. 1984) or applying damped least squares (e.g. Oldenburg
1981; Sipkin & Lerner-Lam 1992). We have found that the
side lobes resulting from the deconvolution of a single
seismogram, even with careful application of prewhitening,
are coherent across the several receiver functions produced
for a given bin and, therefore, stack coherently. To improve
the signal-to-noise ratio as a step in the deconvolution, we
perform a simultaneous deconvolution of events that would
normally be stacked after deconvolution, thereby reducing
the need for damping (prewhitening). Our approach is akin
to that of Oldenburg (1981), who described a frequency-domain
multichannel deconvolution. In particular, we have
modified the production of receiver functions by time-domain
simultaneous deconvolution (Ammon 1991) to
include damping terms similar to those outlined by Sipkin &
Lerner-Lam (1992) for single-channel deconvolution. We
will also give a brief illustration of the ease with which
different penalty functions can be applied with the
time-domain approach by showing results of a lower-bounded
least-squares deconvolution.

In  regions  with  horizontally  stratified  geology,  all  of  the 
converted S energy will be found on the radial components.
In areas with dipping layers or anisotropy, the tangential
components of the receiver functions become important as
well. The deconvolution procedure is, however, the same for
both the radial and transverse components of the receiver
functions. A major problem in producing receiver functions
is to distinguish individual peaks that can be obscured by
neighbouring larger peaks or their side lobes. Such problems
are greatest on the radial components since they have larger
arrivals  than  do  the  transverse  ones.  Since  this  paper  focuses 
on  the  deconvolution  technique  rather  than  on  the 
interpretation  of  receiver  functions,  we  will  only  deal  with 
radial  components  in  the  examples. 


PROBLEMS  NOTED  WITH  THE  EXISTING 
DECONVOLUTION  METHOD 


Deconvolution  is  typically  formulated  in  the  frequency 
domain  in  terms  of  spectral  division.  A  common 
implementation for receiver functions (e.g. Owens, Taylor &
Zandt 1983; Owens et al. 1984) uses the form:

$$ r = \frac{h v^{*}}{v v^{*} + w} \qquad (1) $$

where r is the Fourier transform of the receiver function, h
and v are the Fourier transforms of the horizontal and
vertical components of the seismogram respectively, *
indicates complex conjugation, and w is a prewhitening
function. The prewhitening is carried out by replacing the
power level vv* by a so-called 'water level' anywhere it falls
below a specified value, usually a given fraction of the peak
power level (e.g. Owens, Taylor & Zandt 1983). By filling
the troughs in the denominator of eq. (1), we avoid spurious
peaks that may appear in r and cause ringing in the receiver
functions. Receiver functions are computed for a range of
water levels, and a 'best' water level is selected on an
individual basis, usually according to rather subjective
criteria.

Figure 1. The top panel shows receiver functions computed by
frequency-domain deconvolution of 14 events recorded at Arti,
Russia (ARU). The middle panel shows the stack of these 14
receiver functions and the traces representing ± two standard
deviations of the mean. The bottom panel shows the receiver
function computed using simultaneous time-domain deconvolution
of the same 14 events and the traces representing ± two standard
deviations of the mean.
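
The water-level procedure described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for Fig. 1: the function name and the default fraction are hypothetical, the traces are zero-padded to avoid wrap-around, and no low-pass filter is applied to the result.

```python
import numpy as np

def water_level_deconvolution(h, v, water_level=0.01):
    """Divide the spectrum of h by that of v, flooring the power of v.

    water_level is the fraction of the peak power below which the
    denominator is replaced by a constant (the 'water level').
    """
    n = len(h) + len(v) - 1              # pad to avoid circular wrap-around
    H = np.fft.rfft(h, n)
    V = np.fft.rfft(v, n)
    power = (V * np.conj(V)).real        # v v*
    floor = water_level * power.max()
    denominator = np.maximum(power, floor)   # fill the spectral troughs
    R = H * np.conj(V) / denominator
    return np.fft.irfft(R, n)

# Synthetic check with an invented wavelet and spike series.
v = np.array([1.0, 0.5, 0.25])
r_true = np.array([1.0, 0.0, 0.0, 0.4, 0.0, -0.2])
h = np.convolve(v, r_true)
r_est = water_level_deconvolution(h, v, water_level=1e-12)
```

With noise-free data and a negligible water level the spike series is recovered; raising the water level trades that fidelity for stability, and the choice of level is the subjective step the text refers to.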

The top panel of Fig. 1 shows 14 receiver functions
computed from seismograms recorded at Arti, Russia
(ARU) using the frequency-domain deconvolution described
above. The middle panel depicts the stack of these
14 receiver functions along with traces representing ±2
standard deviations (determined by a jackknife technique,
Efron & Tibshirani 1986) of the mean about the stack. We
observe a broad trough on either side of the peak at 0 s
which illustrates that artefacts of deconvolution tend to be
coherent  features  and  are  enhanced  by  stacking  after 
deconvolution.  Since  this  peak  is  the  first  arrival  we  can  be 
sure  that  the  trough  preceding  it  is  a  result  of  deconvolution. 
Because  we  expect  side  lobes  to  be  symmetric  about  the 
main  peak  we  suspect  that  later  peaks  and  troughs  are 
impinged  upon  by  the  side  lobes  of  the  primary  peak.  Sipkin 
&  Lerner-Lam  (1992)  outline  this  problem  in  the  context  of 
deconvolution  of  instrument  response  from  seismograms 
and  describe  techniques  to  minimize  this  effect  on  individual 
seismograms.  By  adapting  their  approach  for  simultaneous 
deconvolution,  we  significantly  reduce  these  side  lobes.  The 
lowermost  panel  of  Fig.  1  shows  a  receiver  function 
computed  using  the  simultaneous  deconvolution  method 
described  below  along  with  the  standard  deviation  of  the 
mean  computed  as  in  the  middle  panel. 
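
For readers unfamiliar with the jackknife, the sketch below shows a delete-one jackknife estimate of the standard deviation of the mean of a stack. The delete-one variant and the function name are assumptions made for illustration; the text states only that a jackknife technique was used.

```python
import numpy as np

def jackknife_stack(traces):
    """Return the stack (mean trace) and its jackknife standard deviation.

    traces has shape (n_events, n_samples); each event is left out in
    turn and the spread of the resulting stacks is measured.
    """
    traces = np.asarray(traces, dtype=float)
    n = traces.shape[0]
    stack = traces.mean(axis=0)
    # Leave-one-out stacks, one row per omitted event.
    leave_one_out = (traces.sum(axis=0) - traces) / (n - 1)
    deviations = leave_one_out - leave_one_out.mean(axis=0)
    variance = (n - 1) / n * (deviations ** 2).sum(axis=0)
    return stack, np.sqrt(variance)

# Invented example: 14 noisy copies of a trace with a single unit pulse.
rng = np.random.default_rng(0)
signal = np.zeros(50)
signal[10] = 1.0
traces = signal + 0.1 * rng.normal(size=(14, 50))
stack, sigma = jackknife_stack(traces)
upper, lower = stack + 2.0 * sigma, stack - 2.0 * sigma
```

For a simple mean the jackknife estimate coincides with the usual standard error; its practical value is that the same recipe applies when the stacked quantity is produced by a less simple estimator.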

In  this  figure,  we  observe  that  the  side  lobes  leading  into 
the  initial  peak  can  be  greatly  diminished  by  greater  care  in 
the  deconvolution.  A  lower  standard  deviation  does  not 
necessarily  indicate  a  better  receiver  function — especially  in 
view  of  the  fact  that  the  repeatable  features  between 
receiver  functions  may  result  from  artefacts  in  the 
deconvolution.  The  simultaneous  deconvolution  does, 
nevertheless,  result  in  lower  standard  deviations  in  the 
pre-event noise as well as in the primary peak at 0 s and in
the Ps conversion from the Moho at 5 s. The lowermost
panel also exhibits greater resolution of the small peak at 2 s
which only appears as a shoulder to the main peak in the
frequency-domain stack. On the lowermost panel we do,
however,  notice  that  the  standard  deviation  of  the  mean 
about  troughs  from  7  to  13  s  is  greater  than  on  the  stacked 
receiver  functions  computed  individually  by  frequency- 
domain  deconvolution. 

As  with  any  study  using  real  data,  visual  inspection  of  the 
seismograms  is  required.  In  the  case  of  receiver  functions  we 
typically  determine  the  ‘usable’  data  after  applying  the 
deconvolution.  We  compute  individual  receiver  functions 
using  the  frequency-domain  deconvolution  with  all  data 
available  from  a  given  station  and  only  present  results  from 
those  data  which  produce  a  strong  first  arrival  at  time  0.  The 
purpose of the simultaneous time-domain deconvolution is
then  to  be  able  to  apply  an  objective  (reproducible) 
criterion  for  the  damping  function  (prewhitening)  and  to  be 
able  to  apply  damping  functions  that  are  tailored  to  mitigate 
specific  problems  presented  by  a  given  data  set. 

SIMULTANEOUS DECONVOLUTION

By  casting  the  deconvolution  as  a  linear  inverse  problem  we 
are  able  to  use  more  objective  stabilization  procedures  than 
the  somewhat  subjective  method  of  prewhitening  described 
in  eq.  (1).  For  the  ideal  (noise  free)  case,  we  can  write  the 
convolution  problem  as: 

$$ V r = h, \qquad (2) $$

where h (l elements) can be either the radial or transverse
component of the seismogram, r (m elements) is the
corresponding component of the receiver function, and V is
composed of m columns each containing the vertical
component of the seismogram (v with n elements) padded
with zeros to fill the columns to a length l:
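
A matrix matching this description can be built numerically as follows. The sketch assumes that each column holds v delayed by one more sample than the column before it and that l = n + m - 1, which is what a full linear convolution requires; the function name is hypothetical.

```python
import numpy as np

def convolution_matrix(v, m):
    """Return the (n + m - 1) x m matrix V with V @ r == np.convolve(v, r)."""
    v = np.asarray(v, dtype=float)
    n = v.size
    V = np.zeros((n + m - 1, m))
    for j in range(m):
        V[j:j + n, j] = v        # column j: v delayed by j samples, zero-padded
    return V

# Invented vertical component (n = 3) and receiver function (m = 4).
v = np.array([1.0, 2.0, 3.0])
r = np.array([1.0, 0.0, -1.0, 0.5])
V = convolution_matrix(v, r.size)
h = V @ r                        # same samples as np.convolve(v, r)
```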


In  the  absence  of  noise,  we  could  solve  eq.  (2)  directly  for 
the  receiver  function  (similarly  for  noise-free  data  a  stable 
solution  can  be  found  in  principle  by  using  a  water  level  of 
0.0  in  eq.  1).  As  in  the  case  of  similar  inversions  (e.g. 
Constable et al. 1987; Sipkin & Lerner-Lam 1992), noise is
present in the h term; therefore an exact solution of eq. (2)
would overfit the data. It is desirable to find a solution in
which the rms difference between the observations h and
predicted data Vr takes an appropriate value, such as the
standard  deviation,  T,  of  the  pre-event  noise.  We  therefore 
seek  a  receiver  function  r  which  satisfies  the  following 
equation. 

$$ \|V r - h\|^{2} - T^{2} = 0. \qquad (4) $$

The notation $\|x\|$ in the above equation denotes the L2
norm  of  the  vector  x.  In  the  deconvolution  for  receiver 
functions,  noise  is  also  present  in  V  and  we  do  not  always 
expect  to  satisfy  this  constraint  exactly.  The  solution  of  eq. 
(4)  for  r  is  not  unique  and  may  be  unstable.  Therefore,  we 
stabilize  the  problem  by  solving  for  a  receiver  function  that 
satisfies  a  particular  constraint.  For  the  examples  given  in 
this  paper,  we  seek  the  ‘smallest’  receiver  function,  in  which 
the 'size', U, is defined by:

$$ \|r\|^{2} = U. \qquad (5) $$

By multiplying eq. (4) by a Lagrange multiplier ($\mu^{-1}$) and
adding it to eq. (5) we make the trade-off between both
constraints explicit by minimizing the quantity

$$ \|r\|^{2} + \mu^{-1}\left(\|V r - h\|^{2} - T^{2}\right) = U. \qquad (6) $$

By  differentiating  the  left-hand  side  with  respect  to  r  and 
setting  the  result  to  zero,  we  obtain  a  vector  r  which  yields  a 


Simultaneous  time-domain  deconvolution 

stationary  value  of  U  (Constable  et  al.  1987;  Sipkin  & 
Lerner-Lam  1992). 

r-(l  +  /i*,VTV)-V,VTh  (7) 

I is the m × m identity matrix. In an ideal situation, the
appropriate value for μ would be that which satisfies the
constraint given by eq. (4). However, as mentioned above,
there is noise in both h and V in eq. (7), as well as basic
imprecision in the formulation of receiver functions
(resulting from the fact that the vertical component of the
seismogram contains some S-wave energy, horizontal
components contain P-wave energy, event binning is not a
perfect procedure, etc.). For these reasons, we typically do
not find exact solutions to eq. (4). Instead, we select an
appropriate value for μ by repeating the inversion, each
time using a different value of μ, until convergence on a
stable value of misfit (see discussion below).
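As a minimal numerical sketch of eq. (7), and not the authors' implementation, the following Python builds the linear-convolution matrix V from a vertical trace and solves the damped system with NumPy. The function names, the toy traces and the two values of μ are illustrative assumptions; the receiver-function length is taken here as len(h) - len(v) + 1 samples.

```python
import numpy as np

def convolution_matrix(v, m):
    """Return the (len(v) + m - 1) x m matrix V with V @ r == np.convolve(v, r)."""
    n = len(v)
    V = np.zeros((n + m - 1, m))
    for k in range(m):
        V[k:k + n, k] = v              # column k is v delayed by k samples
    return V

def damped_deconvolution(v, h, mu):
    """Evaluate eq. (7): r = (I + V^T V / mu)^-1 (V^T h / mu)."""
    m = len(h) - len(v) + 1            # receiver-function length for linear convolution
    V = convolution_matrix(v, m)
    lhs = np.eye(m) + (V.T @ V) / mu
    rhs = (V.T @ h) / mu
    return np.linalg.solve(lhs, rhs)

# Hypothetical noise-free example: a 3-sample "vertical" wavelet and a
# 4-sample spike series standing in for the receiver function.
v = np.array([1.0, -0.5, 0.25])
r_true = np.array([1.0, 0.0, 0.4, -0.2])
h = np.convolve(v, r_true)             # "radial" trace

r_light = damped_deconvolution(v, h, mu=1e-8)   # almost undamped
r_heavy = damped_deconvolution(v, h, mu=1e2)    # strongly damped
```

With a very small μ the misfit term dominates and the spike series is recovered almost exactly; with a large μ the norm term dominates and the solution shrinks towards zero, which is the trade-off expressed by eq. (6).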

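To extend this method for the simultaneous deconvolution
of several (N) events we modify eq. (2) to read:

$$\begin{pmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \vdots \\ \mathbf{h}_N \end{pmatrix} = \begin{pmatrix} \mathbf{V}_1 \\ \mathbf{V}_2 \\ \vdots \\ \mathbf{V}_N \end{pmatrix} \mathbf{r} \qquad (8)$$

in which each V_j and h_j (j = 1, ..., N) are the same as V
and h, defined above, for the jth seismogram. Following the
same steps as outlined in eqs (2) through (7) we obtain:

$$\mathbf{r} = \left(\mathbf{I} + \mu^{-1}\sum_{j=1}^{N}\mathbf{V}_j^{T}\mathbf{V}_j\right)^{-1}\mu^{-1}\sum_{j=1}^{N}\mathbf{V}_j^{T}\mathbf{h}_j \qquad (9)$$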

Simultaneous  deconvolution  (eq.  9)  requires,  therefore,  no 
larger  a  matrix  inversion  than  does  the  deconvolution  of  a 
single  seismogram. 
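The remark about matrix size can be illustrated with a short sketch, under the assumption that each event contributes its own convolution matrix V_j and radial trace h_j. Only the m × m sum of V_j^T V_j and the length-m sum of V_j^T h_j are accumulated, so memory does not grow with the number of events. The helper names and the random test traces are hypothetical.

```python
import numpy as np

def convolution_matrix(v, m):
    """Matrix V with V @ r == np.convolve(v, r) for len(r) == m."""
    n = len(v)
    V = np.zeros((n + m - 1, m))
    for k in range(m):
        V[k:k + n, k] = v
    return V

def simultaneous_deconvolution(verticals, radials, m, mu):
    """Eq. (9): accumulate V_j^T V_j and V_j^T h_j, then solve one m x m system."""
    gram = np.zeros((m, m))
    xcor = np.zeros(m)
    for v, h in zip(verticals, radials):
        V = convolution_matrix(v, m)
        gram += V.T @ V
        xcor += V.T @ h
    return np.linalg.solve(np.eye(m) + gram / mu, xcor / mu)

rng = np.random.default_rng(0)
r_true = np.array([1.0, 0.0, 0.4, -0.2, 0.0, 0.1])
m = len(r_true)
verticals = [rng.standard_normal(n) for n in (20, 25, 30)]   # events may differ in length
radials = [np.convolve(v, r_true) + 0.05 * rng.standard_normal(len(v) + m - 1)
           for v in verticals]

mu = 1e-3
r_sim = simultaneous_deconvolution(verticals, radials, m, mu)

# The explicitly stacked system gives the same answer but stores every row.
V_all = np.vstack([convolution_matrix(v, m) for v in verticals])
h_all = np.concatenate(radials)
r_stacked = np.linalg.solve(np.eye(m) + V_all.T @ V_all / mu, V_all.T @ h_all / mu)
```

The two solutions agree to rounding error because V_all^T V_all is exactly the sum of the per-event products.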

Figure 2 shows the misfit (first term of eq. 4) versus model
size (eq. 5) for the simultaneous time-domain deconvolutions
used to compute the lowermost receiver function in
Fig. 1. In this case, we started by computing a receiver
function using μ = 10^10. We then computed additional
receiver functions by repeated inversions, each time
reducing the value of μ by one order of magnitude. This
process was repeated until the value of misfit (the rms
difference between the radial components of the observed
seismograms and those computed by convolving the receiver
function with the respective vertical components) was
decreased from that of the previous receiver function by less
than 0.05 per cent (typically resulting in a value for μ of
between 100 and 1.0). The final receiver function depicted in
Fig. 1 was computed using μ = 100.
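The search over μ described above can be sketched as a simple loop. This is an illustration under stated assumptions rather than the procedure's actual code: the function names are hypothetical, a single synthetic event is used, and the test traces are scaled to large amplitudes (as for digitizer counts) so that the starting value μ = 10^10 is comparable to the size of V^T V. If μ started many orders of magnitude above that scale, the first steps would barely change the misfit and this stopping test would trigger too early.

```python
import numpy as np

def convolution_matrix(v, m):
    """Matrix V with V @ r == np.convolve(v, r) for len(r) == m."""
    n = len(v)
    V = np.zeros((n + m - 1, m))
    for k in range(m):
        V[k:k + n, k] = v
    return V

def rms_misfit(V, h, r):
    """rms difference between the observed h and the reconvolved prediction V @ r."""
    return np.sqrt(np.mean((V @ r - h) ** 2))

def select_mu(V, h, mu_start=1e10, rel_tol=5e-4, max_steps=30):
    """Lower mu tenfold per step; stop when the misfit drops by < 0.05 per cent."""
    m = V.shape[1]
    gram, xcor = V.T @ V, V.T @ h

    def solve(mu):
        return np.linalg.solve(np.eye(m) + gram / mu, xcor / mu)

    mu = mu_start
    r = solve(mu)
    misfit = rms_misfit(V, h, r)
    history = [(mu, misfit)]
    for _ in range(max_steps):
        mu = mu / 10.0
        r = solve(mu)
        new_misfit = rms_misfit(V, h, r)
        history.append((mu, new_misfit))
        converged = (misfit - new_misfit) < rel_tol * misfit
        misfit = new_misfit
        if converged:
            break
    return mu, r, history

rng = np.random.default_rng(1)
r_true = np.array([1.0, 0.0, 0.4, -0.2, 0.0, 0.1])
v = 1.0e4 * rng.standard_normal(100)                 # large-amplitude "vertical" trace
V = convolution_matrix(v, len(r_true))
h = V @ r_true + 500.0 * rng.standard_normal(V.shape[0])

mu_final, r_final, history = select_mu(V, h)
```

The list `history` holds the (μ, misfit) pairs; pairing each misfit with the norm of the corresponding solution would give a trade-off curve of the kind shown in Fig. 2.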

In the next section, the misfit values shown above the
receiver functions in each figure are normalized by the
standard deviations of the pre-event noise, T_j. For the ideal
case in which eq. (4) is satisfied, the normalized misfit would
have a value of 1.0. It is obvious from Fig. 2, in which we
converged on a value of misfit close to 15, that it is usually
not possible to reach the ideal value of 1.0.

Figure 2. Misfit versus model-norm size for various Lagrange
multipliers (μ) applied to the simultaneous deconvolution to
compute the receiver function shown on the bottom of Fig. 1. The
misfit is defined as the rms difference between the observed radial
component of the seismograms and those predicted by a
reconvolution of the receiver function with the vertical component.
The model-norm size for this example is defined as the rms sum of
all the components of the receiver function. To compute the
receiver function shown on the bottom panel of Fig. 1 we used
μ = 10^2.

This method and the frequency-domain method are both
examples of damped least-squares deconvolution.
In the frequency-domain inversion, white noise is added (up
to a given ‘water level’) to stabilize the inversion. The
time-domain inversion is stabilized by fitting the data only to
a specified misfit, using a Lagrange multiplier to balance
misfit against some measure of model size (norm). In this
sense, the Lagrange multiplier serves the same purpose as
the ‘water level’ in the frequency-domain deconvolution.
The advantages are:

(1) that we stabilize the inversion by making use of some
property of the receiver function;

(2) that the convergence criterion can be specified
objectively and in simple terms so that the results are
repeatable by other investigators.

Which particular property of the receiver function to use as
the damping function is user-selectable and, as we will
demonstrate in the following section, this ‘model norm’ can
be tailored to address specific problems.

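The equivalent expression to eq. (9) for a frequency-domain
simultaneous deconvolution (using the same
notation as in eq. 1) is given by:

$$R(\omega) = \frac{\sum_{j=1}^{N} H_j(\omega)\,V_j^{*}(\omega)}{\sum_{j=1}^{N} V_j(\omega)\,V_j^{*}(\omega)} \qquad (10)$$

in which the sum in the denominator is damped (prewhitened) to the water level.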

It is obvious from eq. (10) that simultaneous deconvolution
is equivalent to summing the cross-correlation of the
vertical (input to the system) channels with the corresponding
horizontal (the observed output of the system) and
normalizing the result by the damped (prewhitened) sum of
autocorrelations of the vertical components to produce a
receiver function (the system). By summing prior to
performing the inversion, we improve the signal-to-noise
ratio of the autocorrelation and cross-correlation functions
and reduce the amount of damping (prewhitening) needed.
We recognize that the simultaneous frequency-domain
deconvolution would be faster than the time-domain
deconvolution, but it still requires the subjective ‘water
level’ prewhitening. As stated earlier, our objective is to
take advantage of the more objective and versatile
regularization in the time domain. In the following section,
we compare results of the simultaneous time-domain
deconvolution with those of single-event frequency-domain
deconvolution followed by stacking.
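The frequency-domain counterpart described in this paragraph can be sketched as follows. The water-level rule used here (raising any value of the summed autocorrelation spectrum that falls below a fraction of its maximum) is one common convention and is an assumption, as are the function name, the FFT length and the test traces.

```python
import numpy as np

def simultaneous_spectral_division(verticals, radials, nfft, water_level=0.01):
    """Sum cross- and autocorrelation spectra over events, then divide once."""
    num = np.zeros(nfft // 2 + 1, dtype=complex)
    den = np.zeros(nfft // 2 + 1)
    for v, h in zip(verticals, radials):
        Vf = np.fft.rfft(v, nfft)        # rfft zero-pads each trace to nfft samples
        Hf = np.fft.rfft(h, nfft)
        num += Hf * np.conj(Vf)          # cross-correlation spectrum
        den += np.abs(Vf) ** 2           # autocorrelation spectrum
    den = np.maximum(den, water_level * den.max())   # fill spectral holes
    return np.fft.irfft(num / den, nfft)

rng = np.random.default_rng(3)
r_true = np.array([1.0, 0.0, 0.4, -0.2, 0.0, 0.1])
m = len(r_true)
verticals = [rng.standard_normal(40) for _ in range(3)]
radials = [np.convolve(v, r_true) for v in verticals]   # noise-free radial traces

# nfft must be at least len(v) + m - 1 to avoid wrap-around.
r_est = simultaneous_spectral_division(verticals, radials, nfft=64, water_level=1e-6)
```

Because the spectra are summed before the division, a spectral hole in one vertical component can be filled by the other events, which is why less prewhitening is needed than for single-event division.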

EXAMPLES 

As a first test we applied the simultaneous deconvolution
(eq. 9) to a synthetic data set. We computed a synthetic
seismogram for a hypothetical earth structure (Table 1,
Baker et al. 1995) with a delta-function source. To produce
the ‘true radial receiver function’ for this structure (top
receiver function on Figs 3 and 4), we deconvolved the
radial component of the delta-function synthetic seismogram
by the vertical. We then constructed 25 synthetic
seismograms by convolving 25 different observed P-wave
trains with the vertical and radial components of the above
delta-function synthetic seismogram. We then added 25
different observed vertical and radial seismic-noise samples
to each of the 25 synthetic seismograms respectively. All the
P-wave and noise samples were recorded at the Piñon Flat
(PFO) broad-band seismic station. The signal-to-noise ratios
of this synthetic data set, determined by the ratio of the
peak amplitude to the standard deviation of the pre-event
noise, ranged from about 25 to as low as 5. This range is
typical of observed seismograms.
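The signal-to-noise measure quoted above (peak amplitude divided by the standard deviation of the pre-event noise) is simple to compute. The sketch below assumes the number of pre-event samples is known; the function name and the toy trace are hypothetical.

```python
import numpy as np

def signal_to_noise(trace, n_pre_event):
    """Peak absolute amplitude divided by the std of the pre-event samples."""
    trace = np.asarray(trace, dtype=float)
    noise_std = np.std(trace[:n_pre_event])
    return np.max(np.abs(trace)) / noise_std

# Toy trace: four pre-event noise samples followed by the arrival.
trace = [1.0, -1.0, 1.0, -1.0, 10.0, -4.0, 2.0]
snr = signal_to_noise(trace, n_pre_event=4)
```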

Table 1. Velocity model used to generate the synthetic seismograms.

Vs (km/sec)   Vp (km/sec)   Layer thickness (km)
3.0           5.4           3.4
3.359         6.1           5.8
2.667         4.8           5.0
3.359         6.1           8.0
4.33          7.5           ∞

Figure 3. The uppermost ‘idealized receiver function’ was produced
by deconvolution of noise-free synthetic seismograms (assuming a
delta-function source). The bottom and middle receiver functions
were computed by simultaneous deconvolution of five and 25
synthetic seismograms (described in the text) respectively. The
numbers plotted above each of the receiver functions, on this as
well as all the following figures, are the rms misfits between all of
the observed horizontal components of the seismograms and the
convolution products of the receiver function and the respective
vertical components. The seismograms were weighted for the rms
misfit calculation just as they were for the deconvolution.

Figure 4. The uppermost ‘idealized receiver function’ was produced
by deconvolution of noise-free synthetic seismograms (assuming a
delta-function source). The second and third receiver functions from
the top were computed by stacking 25 and five receiver functions
(respectively), each computed by single-event frequency-domain
deconvolution of the same synthetic seismograms as used to
produce Fig. 2. The bottom receiver function was computed by the
frequency-domain deconvolution of the uncut vertical components
(see text) from the respective 25 horizontal components followed by
stacking.

The time-domain deconvolution used in this paper is
based on a linear convolution for the forward problem (eqs
2 and 3) in order to avoid wrap-around effects. In this
type of deconvolution the receiver-function length is equal
to the difference in the length of the horizontal and vertical
components. A conservative approach to preparing the data
before deconvolution would be first to cut the seismograms
to as great a time length as possible before encountering the
direct S phase, and then to shorten the vertical component by
the desired length of the receiver function. This would not
only result in a very large matrix, but would also include an
unnecessary amount of noise. Better results would be
obtained by cutting the vertical component at the end of the
P-wave coda and then cutting the horizontal components to
a length equal to that of the vertical plus the length of the
desired receiver function. We have found that problems in
reproducing the synthetics arise only when the data are cut
too short. As a result we suggest cutting the data long and
tapering the last 5 s with a cosine function. We used the cut
vertical components in all the time-domain and frequency-domain
examples to follow unless otherwise specified. In the
case of spectral-division deconvolution, this required
padding the shortened vertical component with zeros to give
both components an equal number of samples.
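The data preparation just described can be sketched as below. This is an illustration only: the function names, the sample interval and the window lengths are assumptions, and the taper is applied here to both components, which the text does not state explicitly.

```python
import numpy as np

def cosine_taper_tail(x, n_taper):
    """Taper the last n_taper samples of x smoothly to zero with a half cosine."""
    y = np.array(x, dtype=float)
    if n_taper > 0:
        k = np.arange(1, n_taper + 1)
        y[-n_taper:] *= 0.5 * (1.0 + np.cos(np.pi * k / n_taper))
    return y

def window_pair(vertical, radial, n_vertical, n_rf, dt, taper_seconds=5.0):
    """Cut the vertical to n_vertical samples and the radial to n_vertical + n_rf."""
    n_taper = int(round(taper_seconds / dt))
    v = cosine_taper_tail(vertical[:n_vertical], n_taper)
    h = cosine_taper_tail(radial[:n_vertical + n_rf], n_taper)
    return v, h

def pad_for_spectral_division(v, h):
    """Zero-pad the shorter vertical so both traces have the same number of samples."""
    return np.pad(v, (0, len(h) - len(v))), h

# Hypothetical traces sampled at 2 Hz (dt = 0.5 s), so a 5 s taper spans 10 samples.
rng = np.random.default_rng(4)
vertical = rng.standard_normal(100)
radial = rng.standard_normal(100)

v_cut, h_cut = window_pair(vertical, radial, n_vertical=40, n_rf=30, dt=0.5)
v_pad, h_pad = pad_for_spectral_division(v_cut, h_cut)
```

The cut traces go directly into the time-domain inversion, while the padded pair is what a spectral-division routine would receive.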

The receiver functions depicted in Fig. 3 were computed
by simultaneous deconvolution of five (bottom) and 25
(middle) of the above synthetic seismograms. Fig. 4 depicts
receiver functions computed by spectral-division deconvolution
followed by stacking of the same five (second from
bottom) and 25 (second from top) synthetics used in the
computation of Fig. 3. The lowermost receiver function in
Fig. 4 was computed by spectral-division deconvolution of
the above 25 synthetics followed by stacking, using the
entire vertical component (without cutting and padding it
with zeros). The rms misfits between the predicted and
observed radial components of the seismograms corresponding
to each receiver function are given in these figures. We
note that the receiver functions computed by simultaneous
deconvolution using five and 25 synthetics are very similar:
both resolve all the major peaks observed in the ‘true radial
receiver function’, and the only major difference is a lower
noise level on the receiver function computed with all 25
synthetic seismograms. Considering that for a given
backazimuth and ray parameter we usually do not have as
many as 10 seismograms, it is encouraging that simultaneous
deconvolution produces virtually the same receiver function
for five seismograms as for all 25. In the lowermost receiver
function of Fig. 4, we resolve only the largest peaks
observed on the ‘true radial receiver function’, whereas in
the case of the two middle receiver functions we resolve
most of the features observed in Fig. 3. This indicates that
much of the improved resolution is the result of windowing
the vertical component more carefully to include only the
P-wave source. The stacked receiver functions (Fig. 4)
produced using both five and 25 of the synthetics exhibit
noticeable side lobes leading into all the peaks (most
noticeable on the small peaks between 8 and 12 s) whereas
the peaks on the receiver functions produced by
simultaneous deconvolution (Fig. 3) are generally sharper
and, with the exception of the main peak at 0 s, have no side
lobes. The small peak immediately behind the first arrival is
much sharper on the time-domain receiver functions than on
those of Fig. 4. The time-domain receiver functions
computed using five and 25 synthetic seismograms (Fig. 3)
have about 6 per cent lower rms misfits than those of
the respective stacked frequency-domain receiver functions
(Fig. 4).

Figure 5. (Top) receiver function produced by simultaneous
time-domain deconvolution of 25 events recorded at PFO. (Bottom)
receiver function computed by stacking 25 receiver functions
computed individually from the same recordings by spectral
division. Each of these traces was normalized by its peak
amplitude. The horizontal axis is time in seconds.

Figure 5 depicts a receiver function computed by the
simultaneous time-domain deconvolution of 25 events
recorded at PFO (top) and the stack of 25 receiver functions
computed individually by spectral division (bottom). There
were more than 25 seismograms available for this
backazimuth and ray parameter, but we selected a subset of
25 usable seismograms. This selection was performed by
computing the individual frequency-domain receiver functions
and stacking those which had clear first arrivals
(bottom, Fig. 5). Only those receiver functions used in the
stacked receiver function were included in the simultaneous
deconvolution (top, Fig. 5). The receiver function computed
by simultaneous deconvolution appears to have broader
frequency content but the stacked receiver function has
larger amplitudes (relative to the main phase at time = 0) on
almost all peaks.

Negative troughs in receiver functions are usually the
result of a PpSs reverberation (or other higher order
reverberations) or of a P-to-S conversion (Ps) from a
velocity inversion.

Figure 6. Receiver functions computed by simultaneous lower-bounded
deconvolution of the 25 synthetic seismograms used in
computing Figs 3 and 4. The top receiver function has a lower
bound of zero. The lower-bound constraint decreases from top to
bottom. The bottom receiver function has no lower-bound
constraint. Numbers above the traces are the rms misfits.

Figure 7. Receiver functions computed by simultaneous lower-bounded
deconvolution of the same 25 seismograms as used in
computing Fig. 5. The top receiver function has a lower bound of
zero. The lower-bound constraint decreases from top to bottom.
The bottom receiver function has no lower-bound constraint.
Numbers above the traces are the rms misfits.

To  demonstrate  an  advantage  of  the  time-domain 
formulation  of  the  simultaneous  deconvolution,  we  employ  a 
lower-bounded  least-squares  algorithm  (Lawson  &  Hanson 
1974)  to  solve  the  inverse  problem  (eq.  9)  and  thereby  test 
the  possibility  that  the  negative  troughs  are  an  artefact  of 
the  inherent  non-uniqueness  of  the  deconvolution  problem. 
Fig.  6  shows  receiver  functions  computed  by  lower-bounded 
least  squares  (of  the  same  25  synthetic  seismograms  as  used 
in  the  previous  examples)  with  more  constraining  lower 
bounds  from  bottom  to  top.  We  observe  that  as  negative 
troughs  are  truncated  by  the  lower  bound,  spurious  positive 
peaks  are  added  to  the  receiver  function.  This  occurs  before 
any  significant  variation  in  rms  misfit  is  observed.  Similar 
behaviour  is  observed  in  the  receiver  functions  calculated 
from  the  25  real  seismograms  (used  in  Fig.  5)  computed 
using  increasing  lower  bounds  from  bottom  to  top  (Fig.  7). 
The  similarity  in  growth  of  spurious  peaks  as  a  result  of  the 
truncation  of  troughs  in  both  observed  and  synthetic 
receiver  functions  as  a  response  to  impinging  lower  bounds 
suggests  that  the  negative  troughs  at  1.2  and  4.3  s  in  the 
observed  receiver  functions  are  in  fact  required  to  produce  a 
reasonable  receiver  function. 
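A lower-bounded version of the damped problem can be sketched with NumPy alone. The code below is not the Lawson & Hanson (1974) algorithm cited above: as a stand-in it uses plain projected gradient descent, which takes a gradient step on the damped least-squares objective and then clips the result to the bound. The function names, the synthetic spike series and the chosen bounds are all assumptions for illustration.

```python
import numpy as np

def convolution_matrix(v, m):
    """Matrix V with V @ r == np.convolve(v, r) for len(r) == m."""
    n = len(v)
    V = np.zeros((n + m - 1, m))
    for k in range(m):
        V[k:k + n, k] = v
    return V

def bounded_deconvolution(gram, xcor, mu, lower, n_iter=2000):
    """Minimize ||r||^2 + ||V r - h||^2 / mu subject to r >= lower (projected gradient)."""
    m = len(xcor)
    A = np.eye(m) + gram / mu                  # half the Hessian of the objective
    b = xcor / mu
    step = 1.0 / np.linalg.eigvalsh(A)[-1]     # 1 / largest eigenvalue guarantees descent
    r = np.maximum(np.zeros(m), lower)
    for _ in range(n_iter):
        r = np.maximum(r - step * (A @ r - b), lower)   # gradient step, then clip
    return r

rng = np.random.default_rng(2)
r_true = np.array([1.0, 0.0, -0.3, 0.0, 0.4, 0.0, -0.1, 0.0])   # contains troughs
m = len(r_true)
gram, xcor = np.zeros((m, m)), np.zeros(m)
for _ in range(5):                              # five synthetic events
    V = convolution_matrix(rng.standard_normal(80), m)
    h = V @ r_true + 0.1 * rng.standard_normal(V.shape[0])
    gram += V.T @ V
    xcor += V.T @ h

mu = 1.0
r_free = bounded_deconvolution(gram, xcor, mu, lower=-np.inf)    # no lower bound
r_partial = bounded_deconvolution(gram, xcor, mu, lower=-0.15)   # intermediate bound
r_nonneg = bounded_deconvolution(gram, xcor, mu, lower=0.0)      # lower bound of zero
```

Comparing the three solutions sample by sample shows how the remaining coefficients shift once a trough is clipped, which is the kind of behaviour examined in Figs 6 and 7.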

CONCLUSION 

Our goals in applying the simultaneous time-domain
deconvolution were (1) to reduce the side lobes about the
main peak of the receiver functions and (2) to eliminate the
human intervention needed in selecting a water level in the
frequency-domain deconvolution. As with any study, human
intervention is still necessary to inspect the data and prepare
it prior to deconvolution.

Further, a substantial advantage of time-domain deconvolution
is the flexibility it allows in applying different model
norms. In the examples given above, we minimized a linear
combination of misfit and model size. The example using a
lower-bound constraint on the resulting receiver function
demonstrates the flexibility of the time-domain deconvolution
to determine the validity of troughs in the receiver
functions. Sipkin & Lerner-Lam (1992) describe alternate
roughening and smoothing norms that may also be
appropriate depending on the problem. Finally, simultaneous
deconvolution improves resolution of closely spaced
phases.